DISCLAIMER

The  U.S.  Bureau  of  the  Census  (Census)  and  the  National  Institute  of  Standards  and 
Technology  (NIST)  sponsored  this  Conference  as  part  of  ongoing  research  into  machine 
recognition  of  hand-print.  The  Conference  and  related  exercises  focused  on  a single  step  in 
the  process:  machine  recognition  of  individual  (or  segmented)  characters  without  context. 
With  the  single  variable  nature  of  this  study,  no  valid  comparisons  can  be  made  regarding 
cost  or  performance  of  systems  designed  to  process  entire  forms  or  documents.  Further,  the 
efforts  of  the  participants  in  conducting  the  tests  were  not  proctored  or  monitored  in  any 
way  by  Census  or  NIST. 

While some test results from this Conference may appear in marketing literature, potential
buyers must beware! Census and NIST can make only one recommendation to potential
buyers:  use  your  own  application-specific  data  to  thoroughly  test  the  performance  of  any 
system  (or  component)  in  a realistic  setting. 

Also, reference is made to some commercial products at various points in this report.
Such reference constitutes neither endorsement by Census or NIST, nor implication that the
product so referenced is the best for the particular application.

1 Executive Summary


Bob  Hammond 

1.1  Background 

Since  1790,  the  United  States  has  conducted  a decennial  census,  or  head  count,  of  the 
American  population.  Over  the  last  century,  growth  in  the  population  and  demand  for 
quicker  tabulations  have  presented  very  strenuous  tasks  for  data  capture  and  information 
technology.  In  the  late  1800’s,  tabulating  machines  with  punched  cards  were  invented  for 
Census use. In the 1950’s, staff at Census and the National Bureau of Standards (NBS, now NIST) helped develop the UNIVAC for general
purpose  computing.  About  the  same  time,  they  jointly  developed  the  first  optical  scanning 
device  for  high  speed  mark  recognition  of  microfilm. 

For almost three decades, staff at the Census Bureau have heard claims that machine recognition
of handwriting was just around the technological corner. However, a careful review of
most claims showed that the corner was still a long way off. Recent advances in recognition
of machine print and improvements in microprocessor performance have renewed optimism
for machine recognition of hand print. In the late 1980’s, the Census Bureau enlisted the
Image Recognition Group (IRG) at the National Institute of Standards and Technology
(NIST) to help evaluate these claims more closely.

1.2  The  Conference 

After several years of research, the NIST/IRG had developed a working prototype along with
various methods to measure the performance of other systems. After the 1990 Census, NIST
and  Census  decided  to  sponsor  a scientific  experiment  and  conference  (hereafter  referred  to 
as  the  Conference)  to  determine  the  state  of  the  art  in  this  industry.  NIST  and  Census 
formed  a Committee  having  representatives  from  government,  industry,  and  academia  to 
organize  the  Conference,  and  NIST  personnel  ran  the  Conference. 

Twenty-nine different groups from North America and Europe responded to the call for participation.
Each party received an image database of segmented, hand-printed, alpha and
numeric characters for training their systems. Later, each party received a similar database
for test purposes. Each attempted to recognize the characters, and all but three submitted
their results to NIST for scoring. In late May 1992, all parties that submitted results convened
in Gaithersburg, Maryland, to discuss the results. Scientific and academic participation
was  encouraged,  and  marketing  interests  were  discouraged.  Attendance  was  strictly  limited 
to  sponsors,  participants,  and  up  to  two  associates  designated  by  each  participant,  along 
with  a few  observers  from  federal  agencies  (FBI,  IRS,  USPS)  that  are  currently  sponsoring 
work  in  the  field. 

The Conference and related exercises focused on a single step in the process: machine recognition
of individual (or segmented) characters with no context. With the single-variable
nature of this study, no valid comparisons can be made regarding cost or performance of
systems designed to process entire forms or documents. Further, the efforts of participants
were not proctored or monitored in any way by Census or NIST staff.


1.3  Conclusions 

NIST  and  Census  are  in  no  way  responsible  for  how  these  results  may  be  used.  NIST  made 
every effort to assure the accuracy of the measures computed from the submissions by the
participants.  Nevertheless,  NIST  and  Census  are  aware  that  different  tests,  which  may  be 
more  pertinent  to  real  applications,  might  give  different  results  than  those  reported  here,  and 
that  other  analyses  of  the  submissions  might  give  more  complete  results  than  those  reported 
here. 

While some results from this Conference may appear in marketing literature, under no circumstances
should potential buyers use data from this study as a primary basis for purchasing
decisions. Census and NIST can make only one recommendation to potential buyers: use
your own application-specific data to thoroughly test the performance of any system (or
component) in a realistic setting.

The  Conference  resulted  in  the  following  general  conclusions: 

About half of the systems correctly recognized over 95% of the digits, over 90% of the upper
case  letters,  and  over  80%  of  the  lower  case  letters  in  the  test.  For  comparison,  a human 
correctly  recognized  about  98.5%  of  the  test  digits.  (Chapters  3 and  4 discuss  the  test  data, 
scoring,  and  error  rates  in  detail.) 

While  machine  recognition  of  segmented  digits  appears  to  be  approaching  the  level  of  human 
performance,  one  should  not  extrapolate  this  conclusion  to  the  performance  on  unconstrained 
input,  which  is  a much  more  difficult  problem. 

Further research, development, and testing on realistic sources of hand-printing are needed to
determine the cost and practicality of this technology. Many participants said they learned of
new techniques at the Conference that will help them improve their systems' performance,
but potential buyers should use their unique, application-specific data to thoroughly test
the performance of any system (or component) in a realistic setting. Discussions about
differences in the training data and the test data suggest that various systems may perform
differently with only slight changes in the source data.

For scientific reasons, future efforts to measure performance of competing systems would
ideally isolate the source and extent of error resulting from each step of the recognition process.
However,  given  the  wide  variety  of  approaches  to  various  steps  in  the  process,  this  incremental 
approach  to  the  research  may  be  impractical. 


1.4  Organization  of  this  Report 

The Introduction in Chapter 2 provides an overview of the Conference, materials, theories,
and methods used in the effort. Chapters 3 and 4 discuss in more technical detail the theory
and results of metrics used for scoring and evaluation of results. Chapter 5 attempts to
lay out a taxonomy of approaches to optical character recognition systems, while Chapter 6
qualifies and discusses various considerations on the speed measures of the Conference. The
various appendices offer original copy from the initial call for participation and instructional
material, discussions of lessons learned (and problems encountered), and detailed descriptions
and scoring results of all 118 submissions.

2 Introduction


Jon  Geist 

The goals of the First Census Optical Character Recognition (OCR) Systems Conference
were scientific in nature. The first goal was to gauge the state of the art of OCR of hand-printed
characters with respect to the particular problems associated with entering census
data into a computer database. The second was to learn what is currently limiting the state
of the art. The third goal was to determine whether new databases of hand-printed characters
for  use  either  in  training  or  in  testing  could  be  expected  to  help  to  improve  the  state  of  the 
art  of  OCR  for  applications  such  as  the  census,  and  if  so,  what  types  of  new  databases  are 
needed. 

It  was  decided  that  a test  open  to  organizations  having  strong  OCR  programs  would  be  a 
cost-efficient  tool  for  meeting  these  goals.  This  would  allow  comparison  of  the  results  from 
a wide  variety  of  systems,  algorithms,  features,  and  preprocessing.  Unfortunately,  it  would 
not  be  possible  to  control  the  variables  as  well  as  might  otherwise  be  desirable  with  this  type 
of  experiment,  but  comparison  of  the  results  from  a broad  range  of  systems  was  thought  to 
be  more  important  than  comparison  of  the  results  obtained  from  different  variations  of  a 
single  type  of  system. 

The  full  Census  OCR  task  consists  of  document  handling,  form  identification,  field  isolation, 
character  segmentation,  character  recognition,  and  context-based  field  correction.  However, 
the  recognition  of  segmented  characters  has  been  considered  the  bellwether  of  OCR  progress 
for  some  time.  Therefore,  it  seemed  desirable  to  limit  the  test  to  this  subtask,  both  to 
establish  a baseline  for  this  capability  before  considering  more  complex  combinations  of 
subtasks,  and  to  test  recent  thinking  that  the  recognition  of  segmented  characters  is  no 
longer  the  accuracy-limiting  subtask  for  hand-print  OCR. 

This  decision  required  postponing  tests  that  are  more  typical  of  the  full  Census  OCR  task 
for  future  conferences.  The  test  that  was  implemented  consisted  of  classifying  about  85000 
binary  images  of  segmented  characters  that  were  distributed  on  a CD-ROM.  All  participants 
received  identical  tests,  and  none  had  seen  any  of  the  images  on  the  CD  before  receiving  it. 

The  Conference  had  no  marketing  goals.  In  particular,  the  test  was  not  proctored,  and 
neither  the  capabilities  tested  nor  the  test  images  were  representative  of  any  commercial 
application.  Also,  participants  were  implicitly  encouraged  to  carry  out  experiments  that 
promoted  the  scientific  goals  of  the  Conference,  but  which  might  not  contribute  to  optimum 
system  performance.  For  instance,  a training  CD  that  contained  over  300000  binary  images 
of  segmented  characters  was  provided  to  the  participants  in  advance  of  the  test  CD,  but  the 
participants  were  not  required  to  train  their  OCR  systems  on  the  characters  on  this  CD. 
Instead  they  could  use  their  own  training  data,  and  use  the  training  CD  only  to  experiment 
with the data formats that would be used on the test CD. Nevertheless, many of the participants
chose to train on subsets of the characters on the training CD rather than on internal
databases  that  might  have  given  better  results. 

The results of this test should not be used as the basis for purchasing an OCR system.
Anyone who does base a purchase on these results will probably encounter a number of
serious problems. Decisions regarding applying an OCR system to some specific task should
be based on the results of proctored tests with test materials that are typical of that task.
On the other hand, the methodologies developed for this Conference and the results obtained
from this Conference should prove quite useful in designing tests, both large and small, to
support purchasing decisions.


2.1  Organization  of  the  Conference 

The  Conference  was  organized  by  a Committee  consisting  of  the  following  individuals: 

Bob Hammond, Robert Creecy, and Norman W. Larsen, US Bureau of the Census
Charles L. Wilson and Jon Geist, National Institute of Standards and Technology
Dr. Jonathan J. Hull, Center of Excellence for Document Analysis And Recognition
Dr. Thomas P. Vogl, Environmental Research Institute of Michigan
Dr. Christopher J. C. Burges, AT&T Bell Laboratories

Jon Geist, the Committee Chairman, handled the planning of the Conference and the majority
of the interaction with the participants. The Conference was run for the Committee by
the  Image  Recognition  Group  (IRG)  at  the  National  Institute  of  Standards  and  Technology 
(NIST)  under  contract  to  the  US  Bureau  of  the  Census.  Bob  Hammond  administered  the 
contract  supporting  this  Conference. 

The  following  individuals  from  the  NIST  IRG  were  instrumental  in  carrying  out  the  work 
of  the  Conference.  Charles  Wilson,  the  Leader  of  the  NIST  IRG,  assured  that  resources 
were  available  when  needed.  Allen  Wilkinson  coordinated  the  technical  activities  of  the 
Conference  including  the  preparation  of  training  and  test  materials,  the  receipt  of  participant 
submissions, and software troubleshooting. Stan Janet scored all of the submissions. Michael
Garris  designed  and  helped  implement  a new  procedure  for  classifying  the  character  images 
on  the  test  CD  to  take  maximum  advantage  of  the  NIST  IRG  OCR  system  while  still  assuring 
that  every  classification  was  checked  by  a human.  Patrick  Grother  provided  valuable  advice 
on  various  aspects  of  the  Conference  based  on  his  role  as  a participant  representing  the  NIST 
IRG  OCR  system. 

A Call  for  Participation  was  prepared  and  issued  on  behalf  of  the  Committee  as  the  first 
activity  of  the  Conference.  A version  of  the  Call  is  reproduced  in  Appendix  C.  NIST  Special 
Database  3 (SD3)[1]  was  sent  to  the  participants  to  familiarize  them  with  the  test  data 
formats  and  for  possible  use  as  training  data.  A preprint  of  the  documentation  for  that 
database  was  sent  as  instructional  material  at  the  same  time.  NIST  Test  Data  1 (TD1)[2] 
was  sent  to  each  of  the  participants  as  the  test  data.  Instructions  for  the  test  phase  of  the 
Conference were sent with TD1. These are reproduced in Appendix D.

Twenty-nine organizations agreed to participate in the Conference. Three organizations
(HNC, San Diego, CA; Scan Optics, East Hartford, CT; and the University of Massachusetts
at Lowell) chose not to return test results after receiving the training and test materials. The
remaining twenty-six participants, representing over 45 different systems, returned over 115
submissions for scoring. These participants, the names assigned by NIST to the systems for
which they submitted results, the type of results submitted, and pertinent references where
available are summarized in Tables 1 and 2. The detailed activities that occurred before the
meeting phase of the Conference are described in the Appendices mentioned above.

Since  this  was  the  first  conference  of  its  kind  being  run  by  the  NIST  IRG,  a number  of 
problems  were  encountered.  These  are  discussed  in  Appendices  A and  B of  this  report.  The 
former  is  a list  of  issues  raised  by  the  participants  during  the  Conference  meeting.  The  latter 
presents,  from  the  NIST  perspective,  a short  list  of  problems  along  with  possible  solutions 
proposed  by  various  individuals.  A short  discussion  is  also  included  for  each  problem,  both 
to provide background information, and to indicate how practical each proposed solution
appears. 

2.2  Summary  of  Results 

Classification,  rejection,  confidence,  and  error  are  general  ideas  of  importance  in  OCR.  The 
following  definitions  of  these  and  related  terms  will  be  used  throughout  this  report. 

A classification process assigns an ASCII character to an image of a character. The classification
may be correct or incorrect.

A rejection  process  divides  a set  of  classifications  into  rejected  classifications  and  accepted 
classifications.  Only  the  accepted  classifications  are  considered  useful. 

Usually,  the  rejection  mechanism  is  applied  after  the  classification  process.  However,  some 
rejection  processes  work  in  parallel  with  the  classification  process,  and  no  classification  is 
assigned  when  a character  is  rejected.  Most  systems  that  carry  out  the  rejection  process 
after  the  classification  process  produce  what  is  called  a confidence  for  each  classification. 
This  is  a number  (usually  between  zero  and  one)  that  orders  the  classifications  according  to 
expected  reliability. 

The  rejection  rate  for  a set  of  character  classifications  is  defined  as  the  ratio  of  the  number  of 
characters  rejected  by  the  rejection  process  to  the  total  number  of  characters  presented  for 
classification. For convenience, this report will refer to classifications that are not rejected by
the rejection process at any given rejection rate as unrejected or accepted classifications.

If a confidence is associated with each classification, any desired rejection rate can be obtained
by choosing the appropriate value for the confidence threshold, rejecting any classifications
having confidences less than or equal to the threshold, and accepting any classifications having
confidences greater than the threshold.

The  error  rate  for  a set  of  classifications  is  defined  as  the  fraction  of  the  unrejected  (accepted) 
characters  that  are  classified  incorrectly.  Therefore,  the  error  rate  varies  as  a function  of  the 
rejection  rate. 
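
As a minimal sketch of these definitions (not the scoring software used by NIST), the following Python function computes the rejection rate and the error rate of a set of classifications at one confidence threshold. The function name and the tuple layout (assigned class, true class, confidence) are assumptions made for illustration.

```python
def rates_at_threshold(classifications, threshold):
    """Return (rejection_rate, error_rate) at one confidence threshold.

    classifications: non-empty list of (assigned, truth, confidence) tuples.
    A classification is rejected when its confidence is less than or
    equal to the threshold, and accepted otherwise.
    """
    total = len(classifications)
    if total == 0:
        raise ValueError("no classifications to score")
    accepted = [(a, t) for a, t, c in classifications if c > threshold]
    rejected = total - len(accepted)
    errors = sum(1 for a, t in accepted if a != t)
    # Rejection rate is relative to all characters presented.
    rejection_rate = rejected / total
    # Error rate is relative to the accepted characters only;
    # it is undefined (None) when every character is rejected.
    error_rate = errors / len(accepted) if accepted else None
    return rejection_rate, error_rate


# Hypothetical data: (assigned class, true class, confidence).
sample = [
    ("3", "3", 0.95),
    ("7", "1", 0.40),
    ("0", "0", 0.80),
    ("5", "6", 0.70),
]
print(rates_at_threshold(sample, 0.0))
print(rates_at_threshold(sample, 0.5))
```

Raising the threshold from 0.0 to 0.5 rejects the least confident classification, which here is one of the two errors, so the error rate among the accepted characters drops while the rejection rate rises.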

Table  3 lists  the  zero-rejection-rate  error  rates  for  all  of  the  OCR  results  submitted  to  the 
Conference.  As  a very  rough  summary,  about  half  of  the  systems  produced  error  rates  of 
less  than  5%  at  zero  rejection  rate  for  digits,  about  half  produced  error  rates  of  less  than 
10%  at  zero  rejection  rate  for  upper  case  letters,  and  about  half  produced  error  rates  of  less 
than 20% at zero rejection rate for lower case letters.

PARTICIPATING ORGANIZATION  SYSTEM    DIGIT UPPER LOWER  REFERENCES

AEG Electrocom GmbH         AEG       X X X
Konstanz, Germany

Adaptive Solutions, Inc.    ASOL      X X X              [3][4]
Beaverton, OR

AT&T Bell Laboratories      ATT_1     X X X              [5][6][7][8]
Holmdel, NJ                 ATT_2     X X X              [5][6][7][8]
                            ATT_3     X X X              [9][10]
                            ATT_4     X X X              [5][6][7][8]

Com Com Systems, Inc.       COMCOM    X X X
Clearwater, FL

ELSAG BAILEY, INC.          ELSAGB_1  X
Conshohocken, PA            ELSAGB_2  X
                            ELSAGB_3  X

Environmental Research      ERIM_1    X X X              [11][12][13][14]
Institute of Michigan       ERIM_2    X                  [11][12][13][14]
Ann Arbor, MI

Gesellschaft für            GMD_1     X X X              [15][16]
Mathematik und              GMD_2     X X X              [17][18]
Datenverarbeitung           GMD_3     X X X              [15][16]
Sankt Augustin, Germany     GMD_4     X X X              [15][16]

GTESS Corporation           GTESS_1   X X X
Richardson, TX              GTESS_2   X X X

Hughes Aircraft Company     HUGHES_1  X X X              [19]
Canoga Park, CA             HUGHES_2  X X X              [19]

IBM Almaden                 IBM       X X X              [20][21][22]
Research Center,                                         [23][24][25][26]
San Jose, CA

InterFax, Inc.              IFAX      X X
Sunnyvale, CA

Kaman Sciences              KAMAN_1   X X X
Corporation                 KAMAN_2   X X X
Utica, NY                   KAMAN_3   X X X
                            KAMAN_4   X X X
                            KAMAN_5   X X X

Table 1: List of participants, system names, tests, and references

PARTICIPATING ORGANIZATION  SYSTEM    DIGIT UPPER LOWER  REFERENCES

Eastman Kodak Co.           KODAK_1   X X X              [5][27][28]
Rochester, NY               KODAK_2   X                  [5][27][28]

Mimetics                    MIME      X X                [29][30]
Chatenay Malabry,
France

Nestor, Inc.                NESTOR    X X X              [19][31][32]
Providence, RI                                           [10][33][34][35]
                                                         [36][37][38]

National Institute of       NIST_1    X X X              [39]
Standards and Tech.         NIST_2    X X X              [40]
Gaithersburg, MD            NIST_3    X X X              [41]
                            NIST_4    X X X              [42]

NYNEX Sciences              NYNEX     X X X
Technology, Inc.
White Plains, NY

OCR SYSTEMS, Inc.           OCRSYS    X X X
Huntingdon Valley, PA

Recognition Equip. Inc.     REI       X X
Dallas, TX

Riso National Lab.          RISO      X X X
Roskilde, Denmark

Symbus Technology           SYMBUS    X X
Waltham, MA

Thinking Machines           THINK_1   X
Corporation                 THINK_2   X
Cambridge, MA

University of Bologna       UBOL      X X X              [43][44][45]
Bologna, Italy                                           [46][47][48]
                                                         [49][12][6]

University of               UMICH_1   X X
Michigan-Dearborn
Dearborn, MI

University of Penn.         UPENN     X X X              [50][51][37]
Philadelphia, PA                                         [52][5][53]
                                                         [54][55][56][57]

Universidad Politecnica     VALEN_1   X X X              [58]
de Valencia                 VALEN_2   X                  [59]
Valencia, Spain

Table 2: List of participants, system names, tests, and references

Entered     Percentage Classification Error
System      Digits, Uppers, Lowers

AEG         3.43 ± 0.23, 3.74 ± 0.82, 12.74 ± 0.75
ASOL        8.91 ± 0.39, 11.16 ± 1.05, 21.25 ± 1.36
ATT_1       3.16 ± 0.29, 6.55 ± 0.66, 13.78 ± 0.90
ATT_2       3.67 ± 0.23, 5.63 ± 0.63, 14.06 ± 0.95
ATT_3       4.84 ± 0.24, 6.83 ± 0.86, 16.34 ± 1.11
ATT_4       4.10 ± 0.16, 5.00 ± 0.79, 14.28 ± 0.98
COMCOM      4.56 ± 0.91, 16.94 ± 0.99, 48.00 ± 1.87
ELSAGB_1    5.07 ± 0.32
ELSAGB_2    3.38 ± 0.20
ELSAGB_3    3.35 ± 0.21
ERIM_1      3.88 ± 0.20, 5.18 ± 0.67, 13.79 ± 0.80
ERIM_2      3.92 ± 0.24
GMD_1       8.73 ± 0.35, 14.04 ± 1.00, 22.54 ± 1.22
GMD_2       15.45 ± 0.64, 24.57 ± 0.91, 28.61 ± 1.25
GMD_3       8.13 ± 0.39, 14.22 ± 1.09, 20.85 ± 1.25
GMD_4       10.16 ± 0.35, 15.85 ± 0.95, 22.54 ± 1.22
GTESS_1     6.59 ± 0.18, 8.01 ± 0.59, 17.53 ± 0.75
GTESS_2     6.75 ± 0.30, 8.14 ± 0.59, 18.42 ± 1.09
HUGHES_1    4.84 ± 0.38, 6.46 ± 0.52, 15.39 ± 1.10
HUGHES_2    4.86 ± 0.35, 6.73 ± 0.64, 15.59 ± 1.08
IBM         3.49 ± 0.12, 6.41 ± 0.80, 15.42 ± 0.95
IFAX        17.07 ± 0.34, 19.60 ± 1.26
KAMAN_1     11.46 ± 0.41, 15.03 ± 0.79, 31.11 ± 1.15
KAMAN_2     13.38 ± 0.49, 20.74 ± 0.88, 35.11 ± 1.09
KAMAN_3     13.13 ± 0.45, 19.78 ± 0.60, 33.55 ± 1.37
KAMAN_4     20.72 ± 0.44, 27.28 ± 1.30, 46.25 ± 1.23
KAMAN_5     15.13 ± 0.41, 33.95 ± 1.22, 42.20 ± 0.96
KODAK_1     4.74 ± 0.37, 6.92 ± 0.78, 14.49 ± 0.77
KODAK_2     4.08 ± 0.26
MIME        8.57 ± 0.34, 10.07 ± 0.81
NESTOR      4.53 ± 0.20, 5.90 ± 0.68, 15.39 ± 0.90
NIST_1      7.74 ± 0.31, 13.85 ± 0.83, 18.58 ± 1.12
NIST_2      9.19 ± 0.32, 23.10 ± 0.88, 31.20 ± 1.16
NIST_3      9.73 ± 0.29, 16.93 ± 0.90, 20.29 ± 0.99
NIST_4      4.97 ± 0.30, 10.37 ± 1.28, 20.01 ± 1.06
NYNEX       4.32 ± 0.22, 4.91 ± 0.79, 14.03 ± 0.96
OCRSYS      1.56 ± 0.19, 5.73 ± 0.63, 13.70 ± 0.93
REI         4.01 ± 0.26, 11.74 ± 0.90
RISO        10.55 ± 0.43, 14.14 ± 0.88, 21.72 ± 0.98
SYMBUS      4.71 ± 0.38, 7.29 ± 1.07
THINK_1     4.89 ± 0.24
THINK_2     3.85 ± 0.33
UBOL        4.35 ± 0.20, 6.24 ± 0.66, 15.48 ± 0.81
UMICH_1     5.11 ± 0.94, 15.08 ± 0.92
UPENN       9.08 ± 0.37
VALEN_1     17.95 ± 0.59, 24.18 ± 1.00, 31.60 ± 1.33
VALEN_2     15.75 ± 0.32

Table 3: Mean zero-rejection-rate error rates and standard deviations in percent calculated
over 10 partitions of TD1.

Figures 1 through 6 plot the error rate versus the rejection rate for all of the systems:
Figures 1 and 2 for digits, Figures 3 and 4 for upper case letters, and Figures 5 and 6 for
lower case letters. The odd-numbered figures plot the error versus rejection curves that
were obtained by applying thresholds to the confidence data provided for most of the
systems. The even-numbered figures plot the points obtained directly from rejection data
provided for the remaining systems. Summaries describing the individual systems and more
detailed results are included in Appendices E and F.
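
The way an error versus rejection curve follows from confidence data can be sketched as below. This is an illustration under stated assumptions, not the procedure NIST used to draw the figures: the function name and the tuple layout (assigned class, true class, confidence) are hypothetical, and one point is produced for each distinct observed confidence value used as the threshold.

```python
def error_vs_rejection(classifications):
    """Return a list of (rejection_rate, error_rate) points.

    classifications: non-empty list of (assigned, truth, confidence) tuples.
    The first point is the zero-rejection-rate error rate. Each later
    point rejects every classification whose confidence is less than or
    equal to one of the observed confidence values.
    """
    total = len(classifications)
    if total == 0:
        raise ValueError("no classifications to score")
    ordered = sorted(classifications, key=lambda item: item[2])
    errors_left = sum(1 for a, t, _ in ordered if a != t)
    points = [(0.0, errors_left / total)]
    for k, (assigned, truth, conf) in enumerate(ordered, start=1):
        # After this step the k least confident classifications are rejected.
        if assigned != truth:
            errors_left -= 1
        # Tied confidences are rejected together, so only emit a point
        # at the last member of each tie group.
        last_of_tie = k == total or ordered[k][2] != conf
        # Skip the final step: with everything rejected the error
        # rate is undefined.
        if last_of_tie and k < total:
            points.append((k / total, errors_left / (total - k)))
    return points


# Hypothetical data: (assigned class, true class, confidence).
curve = error_vs_rejection([
    ("3", "3", 0.95),
    ("7", "1", 0.40),
    ("0", "0", 0.80),
    ("5", "6", 0.70),
])
for rejection_rate, error_rate in curve:
    print(f"{rejection_rate:.2f} {error_rate:.4f}")
```

A confidence measure that ranks errors below correct classifications, as in this small example, makes the error rate fall quickly as the rejection rate increases.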

Table 3 contains uncertainties for the error rates presented there. These uncertainties were
calculated by dividing the characters in Test Data 1 into ten sets of characters, each set
being a contiguous block of the test materials. Each set contained a fixed number of classification
hypotheses. Each set was scored separately for each submission, and the mean
and sample standard deviation of those ten scores are recorded in the table. The recognition
error percentages are the definitive scores for each conference participant. They are exact
performance measures over the whole database. The standard deviation is a measure of the
change expected for the given classifiers on alternative subsets of the test data. All results
refer to zero-rejection-rate classification.
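
The partition procedure can be sketched in a few lines of Python. This is a hedged illustration rather than the NIST scoring code: the function name and the boolean input format are assumptions, and any characters left over after forming equal-sized blocks are simply dropped, which the report does not specify.

```python
import statistics


def partition_error_stats(is_error, n_partitions=10):
    """Mean and sample standard deviation (in percent) of the error rate
    over contiguous, equal-sized partitions of the test sequence.

    is_error: list of booleans in test order, True where the
    zero-rejection-rate classification was wrong.
    """
    size = len(is_error) // n_partitions
    if size == 0:
        raise ValueError("fewer characters than partitions")
    rates = []
    for i in range(n_partitions):
        # Each partition is a contiguous block with a fixed number of entries.
        block = is_error[i * size:(i + 1) * size]
        rates.append(100.0 * sum(block) / size)
    # statistics.stdev is the sample (n - 1) standard deviation.
    return statistics.mean(rates), statistics.stdev(rates)


# Hypothetical example: 20 classifications split into 2 blocks of 10,
# with 1 error in the first block and 3 errors in the second.
flags = [True] + [False] * 9 + [True] * 3 + [False] * 7
mean_pct, std_pct = partition_error_stats(flags, n_partitions=2)
print(mean_pct, std_pct)
```

Because every partition has the same size, the mean of the partition error rates equals the error rate over all of the characters used, which is consistent with the statement that the reported percentages are exact over the whole database.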

2.3  Differences  between  Test  and  Training  Databases 

Most participants pointed out that the characters in NIST TD1 seemed different from and
harder to recognize than those in NIST SD3. One participant suggested a cross validation
data study. The study is described in detail in Section 3. The results suggest that TD1
is significantly harder than SD3 for digits, but not significantly harder, only different, for
upper  and  lower  case  letters.  A more  definitive  study  and  supplemental  studies  using  other 
systems  seem  warranted. 

A possible explanation for the different levels of difficulty for the two sets of digit images is
the different way that the two sets were obtained. NIST SD3 and NIST TD1 were obtained
by segmenting the characters filled out in boxes on forms that were variations of that shown
in Figure 7. The forms for SD3 were filled out and returned by 2100 of 3400 permanent
Census field workers as part of the 1990 Census program. The forms for TD1 were filled out
by math and science students in a high school as a short exercise during class. The 2100
Census workers who actually returned filled-out forms to their employer were clearly more
motivated than the 1300 who did not, and may be more motivated than the 500 high school
students who were forced to fill out and return the forms in class. The attitudes of the high
school students are probably more representative of the attitudes of the general population
when filling out a Census form.

The individual characters on the forms obtained from the Census workers were isolated by
a different segmenter than that used with the forms obtained from the high school students.
The segmenter used for SD3 failed to segment a much larger fraction of the
characters presented to it than did the segmenter used to create TD1. If segmenters fail on
the most difficult characters, then this could be another reason why TD1 appeared more
difficult.

All participants who used NIST SD3 for training thought that their error rate for TD1 would
have been lower had they used a better training set. The Kodak and AEG systems both
demonstrated this point in different ways. The only difference between the KODAK_1 and
KODAK_2 system results in Appendix E is the addition of sevens with crosses to SD3 for
training. This one change reduced the zero-rejection-rate digit error rate from 4.7% to 4.1%.
AEG used SD3 for the Conference submission, but reran the digit test after the Conference
using an internal database. This reduced the zero-rejection-rate error rate from 3.4% to 2.9%.
Both of these results are consistent with the cross validation results for digits described in
Section 3.

2.4  Error  Rates  as  a Function  of  Rejection  Rates 

A casual  glance  at  the  curves  in  Figures  1 through  6 suggests  that  they  all  have  very  similar 
shapes.  Section  4 derives  the  ideal  error  rate  versus  rejection  rate  curve,  fits  an  empirical 
equation  to  the  data  in  Figures  1,  3,  and  5,  and  derives  the  probability  distributions  that 
generate the ideal and empirical error rate versus rejection rate curves. As discussed in
Section 4, there is a lot of room for improvement in the shapes of the error rate versus
rejection rate curves produced by the OCR systems before they start to be limited by the
ideal shape. First, their slope at zero rejection rate is less negative than the ideal slope at
zero rejection rate; second, the curvature on a log plot is in the wrong direction.

The last column in Tables 7, 8, and 9 in Section 4 lists the ratios of the slopes of the error
rates at zero rejection rate to the optimum slopes for all of the curves in Figures 1, 3, and 5.
Notice that it is apparently easy to get the ratio of the actual slope to the ideal slope greater
than 30%, but quite difficult to get it greater than 80%.

The first 10000 digits of TD1 were presented without any indication as to the correct class
and classified by a human. This process was carried out at about 2 characters per second
for periods from one to two hours with many hours between the classification periods. The
result was a zero-rejection-rate error rate of 1.57%, which is very close to the digit error rate
for all of the digits in TD1 for the OCRSYS system. It is very interesting that this system
obtained approximately the same zero-rejection-rate error rate as a human, while producing
an error rate versus rejection rate curve that is much less satisfactory than those for the other
OCR systems in the Conference.

2.5  Other  Topics 

Many  different  types  of  classifiers,  feature  extractors,  and  preprocessors  were  used  (including 
no  feature  extractors  or  preprocessors).  The  only  general  conclusion  that  is  evident  so  far  is 
that  good  people  can  make  almost  anything  work.  Section  5 presents  a taxonomy  for  OCR 
systems  and  some  more  detailed  observations  about  the  different  types  of  systems  tested. 

Determining the state of the art with respect to system speed was not a goal of the
Conference. Nevertheless, the participants were asked to provide the time from first CD-ROM
access to last CD-ROM access for their system runs. Some participants provided these times,
some provided times that were associated only with the actual OCR recognition task, some
provided both, and some provided some intermediate times. The results are not meaningful
enough to warrant any figures, but some general conclusions are listed in Section 6.


2.6  Conclusion 

Some  preliminary  conclusions  of  the  Conference  are  listed  below. 

The state of the art of machine OCR of segmented, hand-printed digits is approaching human
performance with respect to the zero-rejection-rate error rate. The results for upper case
letters and lower case letters are probably not as good relative to human performance as the
performance for digits, but no human classifications under the conditions of the Conference
test have been conducted to address this question.

The  digits  in  NIST  SD3  do  not  represent  a heterogeneous  enough  sample  of  hand-printed 
digits  for  optimum  training  of  OCR  systems.  The  same  is  probably  true  for  the  letters  in 
SD3,  but  studies  comparable  to  that  reported  in  Section  3 have  yet  to  be  carried  out. 

The fact that almost all of the systems give similarly shaped error rate versus rejection rate
curves for digits, upper case letters, and lower case letters suggests that these curves come
about as close as can be expected to optimum with the data in NIST TD1. On the other
hand, theoretical studies suggest that there is room for considerable improvement. Further
research will be needed to resolve this paradox.

Many of the participants indicated that the Conference was a useful learning experience.
This includes the NIST IRC. Following the Conference, several simple changes were made
to the NIST_1 system. These changes converted a K-Nearest Neighbor (KNN) system to a
Probabilistic Neural Network (PNN) system using Karhunen-Loeve (KL) features on binary
images obtained from simple preprocessing, and produced the NIST_4 system summarized
in Appendix F. The improvement in performance, a 35% decrease in the zero-rejection-rate
error rate for digits from 7.7% down to 5.0%, is striking. It is noteworthy that the changes
giving this level of improvement were not only easily implemented, but they were more in
the nature of improvements in what was already being done rather than the introduction of
new or different approaches.

Most  of  the  participants  and  Committee  members  believe  that  the  results  of  this  Conference 
were  more  than  sufficient  to  justify  a Conference  on  isolated  fields,  but  there  was  less  of  a 
consensus  on  exactly  what  sort  of  isolated  fields  were  appropriate  for  the  next  test.  When 
forced  to  choose  between  two  extremes,  the  participants  overwhelmingly  preferred  digital 
images  of  the  microfilmed  occupation  and  industry  fields  on  real  Census  forms  along  with 
a dictionary  of  allowed  answers  rather  than  artificial  fields  of  random  letters  digitized  from 
forms such as were used for NIST SD3 and TD1.

[Plot: DIGITS]
Figure 1: Error rate versus rejection rate for all systems providing confidence data with their
classifications for the digit test.

Figure 2: Error rate versus rejection rate for all systems providing rejection data with their
classifications for the digit test.

[Plot: UPPERS]
Figure 3: Error rate versus rejection rate for all systems providing confidence data with their
classifications for the upper case test.

[Plot: UPPERS]
Figure 4: Error rate versus rejection rate for all systems providing rejection data with their
classifications for the upper case test.

[Plot: LOWERS]
Figure 5: Error rate versus rejection rate for all systems providing confidence data with their
classifications for the lower case test.

[Plot: LOWERS]
Figure 6: Error rate versus rejection rate for all systems providing rejection data with their
classifications for the lower case test.

[Image: HANDWRITING SAMPLE FORM with DATE, CITY, STATE and ZIP fields. Printed instructions:
“This sample of handwriting is being collected for use in testing computer recognition of hand
printed numbers and letters. Please print the following characters in the boxes that appear
below.” and “Please print the following text in the box below: We, the People of the United
States, in order to form a more perfect Union, establish Justice, insure domestic Tranquility,
provide for the common Defense, promote the general Welfare, and secure the Blessings of
Liberty to ourselves and our posterity, do ordain and establish this CONSTITUTION for the
United States of America.”]

Figure 7: A typical filled-out sample form

3 Cross Validation Studies


Patrick  J.  Grother 

3.1  Introduction 

Participants in the Conference agreed to classify unlabelled images using their own
recognition systems and submit their classifications to NIST for scoring. NIST provided two
databases to all entrants. The first, SD3, contained the segmented characters of 2100 writers
and the “known” class files. This constituted an optional training set. The second database,
TD1, contained unlabelled characters from 500 writers, and it constituted the test materials.

One result of the Conference was that those recognition systems trained solely on the SD3
database generally displayed inferior TD1 recognition to those trained on a superset of this
data, i.e. one including SD3 as a subset, or other, possibly proprietary, datasets. The notion
that SD3 was “clean” or “constrained” relative to the TD1 dataset was suggested by the
writer profiles; SD3 was obtained from motivated permanent Census field personnel whereas
TD1 was obtained from variously motivated, more diverse and cosmopolitan high school
students. An example is that the European crossed seven is far more abundant in TD1 than
SD3.

A study was initiated to formally investigate the relative differences between the two databases.
The intent was to obtain some classifier-independent measures of the relative database
difficulty - to obtain results that pertain to the properties of the data, and not the particular
recognition algorithm. Cross validation [60] [61] has long been used as a method of obtaining
more “mileage” from a data set. By partitioning the data into disjoint subsets, one for
parameter estimation (i.e. training) and the other for performance measurement (i.e. testing),
more robust estimates of performance statistics are available.


3.2  Theory 

Cross validation is a method for accumulating a statistic, which in this section is the
classification error obtained using a nearest neighbor classifier. Moody [62] expressed cross
validation in terms of the mapping error associated between inputs and targets to a
multilayer perceptron (MLP), but the concept of cross validation is in no way restricted to neural
network classifiers or function approximators.

A problem associated with MLP networks is that patterns (for example, crossed sevens)
present in a training set in small numbers are only weakly represented by the estimated
weights, such that generalization is poor. Algorithms that do not aggregate information
from the training data are not usually prone to this problem. Such a method is the
ubiquitous K-Nearest Neighbor (KNN) algorithm [63]. The distances of an unknown pattern to
elements of a prototype set are calculated using a suitable metric (often Euclidean).
Voting between the classes of the K-closest patterns implies the class of the unknown pattern.
Numerous extensions to the scheme have been used effectively, including an elaboration,
termed “Probabilistic Neural Network” (PNN), developed by Donald Specht [42] in which
all prototypes are included in a gaussian distance weighted metric. The advantage of the
method is that an a posteriori probability is attached to each possible class; the unknown is
classed as that with the highest probability. NIST has used nearest neighbor classifiers that
significantly outperform MLP networks given identical features.
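
The KNN voting rule described above can be sketched in a few lines. This is a minimal
illustration with a squared Euclidean metric and made-up two-dimensional prototypes; the
function name and data are hypothetical and are not taken from any system in the Conference.

```python
from collections import Counter


def knn_classify(x, prototypes, labels, k=3):
    """Classify x by majority vote among its k closest prototypes."""
    # Squared Euclidean distance from x to every prototype, paired with its label.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, p)), label)
        for p, label in zip(prototypes, labels)
    )
    # Vote among the classes of the k nearest prototypes.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical prototype set: three patterns of class "7", two of class "1".
prototypes = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["7", "7", "7", "1", "1"]
print(knn_classify((0.2, 0.2), prototypes, labels))
```

Because every prototype is kept, a rare pattern needs only a few stored examples to win the
vote locally, which is the property the paragraph contrasts with MLP weight estimation.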


3.3  Classification 

The  first  stage  of  classification  for  purposes  of  cross  validation  between  the  two  sets  used 
the  Karhunen  Loeve  (KL)  expansion  of  the  images  as  a reduced  dimensionality,  optimally 
compact,  representation.  The  use  of  such  features  in  OCR  has  been  described [64]  [39]. 

The hand printed binary characters are isolated and represented as the ±1 elements of a
column vector by some consistent ordering of the rectangular image. The mean vector of
P such images is subtracted from each and an ensemble matrix, U, is formed with the P
vectors as its columns. The symmetric covariance matrix, R, which can be expressed as

$$R = U U^{T}, \tag{1}$$

gives the mean of all the interpixel correlations over all images in the ensemble. This can be
used to statistically describe how handprinted character images vary. The covariance matrix
R has eigenvectors as the columns of $\Psi$:

$$R \Psi = \Psi \Lambda, \tag{2}$$

where the only non-zero elements of $\Lambda$ are the eigenvalues on its diagonal. The eigenvectors
are the directions of maximum variance in the image space and form a complete orthonormal
set termed the principal axes of a hyperellipse in that space. The eigenvalues define the
statistical “length” of these axes; thus the first column of $\Psi$, corresponding to the largest
eigenvalue, is the major axis. The eigensolution of the covariance matrix provides a variance
expansion of the image ensemble ordered from largest to smallest eigenvalue. The
eigenvectors having the smaller eigenvalues, and therefore describing very little variance in the images,
are discarded, thus affording useful dimensionality reduction. Any image vector, as a column
of a new matrix U, is a linear superposition of the basis vectors:

$$U = \Psi V, \tag{3}$$

where the inversion of this formula, V, defines the Karhunen Loeve Transform (KLT) [65],
the elements of which are the projection of the image vector onto the principal axes:

$$V = \Psi^{T} U. \tag{4}$$
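
The construction above can be sketched with NumPy. This is only an illustration under stated
assumptions: the covariance is normalized by 1/P, the images are small random ±1 vectors
rather than character images, and the function names are hypothetical.

```python
import numpy as np


def kl_basis(images: np.ndarray, n_keep: int):
    """Return (mean, Psi, eigenvalues) for a (P, D) array of flattened +/-1 images."""
    mean = images.mean(axis=0)
    U = (images - mean).T                  # D x P ensemble matrix, one image per column
    R = U @ U.T / U.shape[1]               # covariance; the 1/P factor is an assumption
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]      # reorder from largest to smallest
    return mean, eigvecs[:, order[:n_keep]], eigvals[order]


def kl_features(images: np.ndarray, mean: np.ndarray, Psi: np.ndarray) -> np.ndarray:
    """Project mean-subtracted images onto the retained axes (V = Psi^T U)."""
    return Psi.T @ (images - mean).T


# Hypothetical ensemble: 60 random "images" of 16 pixels each.
rng = np.random.default_rng(0)
images = rng.choice([-1.0, 1.0], size=(60, 16))
mean, Psi, eigvals = kl_basis(images, n_keep=4)
V = kl_features(images, mean, Psi)         # 4 KL coefficients per image
print(V.shape)
```

Keeping all D eigenvectors makes the expansion exact; keeping only the leading ones gives the
reduced feature vector that is passed to the classifier.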

These feature vectors are classified using the PNN nearest neighbor technique [42]. Although
many variations are described, the NIST_4 implementation reported in Appendix F is as
follows. The squared Euclidean distance of an unknown pattern, $\mathbf{r}$, to the $i$th prototype in the
training set, $\mathbf{f}_i$, is

$$d_i = \sum_{j=1}^{N} \left(r_j - f_{ij}\right)^2, \tag{5}$$

where the subscript j spans the N highest KL eigenvectors retained for the expansion. The
distances $d_i$ are expressed as a function of the standard deviations of normal distributions
centered on each of the prototypes. A gaussian is applied as a kernel weighting function:

$$g_i = \exp\left(-\frac{d_i}{2\sigma^2}\right). \tag{6}$$

The weighted distances are then accumulated by class over the K classes to which the
prototypes belong:

$$\rho_k = \sum_{i=1}^{P} g_i \delta_{ik}, \tag{7}$$

where $\delta_{ik}$ is unity if the $i$th prototype is of class k and zero otherwise. Interestingly this vector
may be normalized to give a posteriori probabilities by dividing by $\sum_k \rho_k$. The unknown
is assigned the class with the largest attached probability. For optimal classification it is
necessary to survey over the gaussian width $\sigma$; for digits the best value was taken as 3.0,
whereas for uppers and lowers a value of 4.0 was adopted.
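
A minimal sketch of this classification rule follows. It assumes the common Gaussian kernel
form exp(-d / 2σ²), which may differ in normalization from the NIST_4 code, and it uses
hypothetical two-dimensional prototypes in place of KL feature vectors.

```python
import numpy as np


def pnn_classify(x, prototypes, labels, n_classes, sigma):
    """Return (predicted class, posterior vector) for one unknown pattern x."""
    d = np.sum((prototypes - x) ** 2, axis=1)    # squared Euclidean distances
    g = np.exp(-d / (2.0 * sigma ** 2))          # Gaussian kernel (assumed form)
    # Accumulate the kernel values by the class of each prototype.
    activations = np.bincount(labels, weights=g, minlength=n_classes)
    posteriors = activations / activations.sum() # normalize to probabilities
    return int(np.argmax(posteriors)), posteriors


# Hypothetical prototypes: two of class 0 near the origin, two of class 1 near (5, 5).
prototypes = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
labels = np.array([0, 0, 1, 1])
cls, post = pnn_classify(np.array([0.2, 0.3]), prototypes, labels, n_classes=2, sigma=3.0)
print(cls, post)
```

If σ is very small relative to the distances, every kernel value can underflow to zero and the
normalization fails, which is one practical reason the width has to be surveyed.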

Rather than use classifiability as a measure of database homogeneity it is possible to obtain
a priori measures. Consider the databases as image ensembles for which the KLT is defined.
The variances of the transform coefficients are the eigenvalues. Since the eigenvectors, the
basis of the KLT, form a complete orthonormal set, any image (including those of the
ensemble from which the covariance matrix is calculated) is exactly a linear superposition of those
bases. If the eigenvalue spectrum is relatively flat, then the variance in an image
ensemble is distributed over many eigenvectors and more eigenvectors are needed for an adequate
representation, as for instance in achieving a low reconstruction mean square error level.

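Interestingly, this total image variance is related to the scatter of the data, S, defined as

$$S = E\left\{\|\mathbf{u}_i - \mathbf{u}_j\|^2\right\}, \tag{8}$$

where the expectation, E{ }, of the underlying distribution is replaced by the sample mean,
whence

$$S = \frac{1}{P^2}\sum_{i=1}^{P}\sum_{j=1}^{P}\left(\mathbf{u}_i - \mathbf{u}_j\right)^{T}\left(\mathbf{u}_i - \mathbf{u}_j\right) \tag{9}$$

$$= \frac{1}{P^2}\left[\sum_{i=1}^{P}\sum_{j=1}^{P}\left(\mathbf{u}_i^{T}\mathbf{u}_i + \mathbf{u}_j^{T}\mathbf{u}_j\right) - \sum_{i=1}^{P}\sum_{j=1}^{P}\left(\mathbf{u}_i^{T}\mathbf{u}_j + \mathbf{u}_j^{T}\mathbf{u}_i\right)\right]. \tag{10}$$

Given that the u are mutually independent and from a single distribution the double sums
are replaceable thus

$$S = \frac{2}{P}\sum_{i=1}^{P}\mathbf{u}_i^{T}\mathbf{u}_i - \frac{2}{P^2}\sum_{i=1}^{P}\mathbf{u}_i^{T}\sum_{j=1}^{P}\mathbf{u}_j. \tag{11}$$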

By considering these inner products as the traces of the outer products of the ensemble
matrices, the scatter becomes twice the difference in the traces of the autocorrelation
and mean outer product matrices. This matrix is identical to the covariance matrix of
equation 1:

$$S = 2\,\mathrm{Trace}\left\{U U^{T}\right\}. \tag{12}$$
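
The identity between the pairwise scatter and twice the trace of the covariance matrix can be
checked numerically. The sketch below assumes the covariance is normalized by 1/P and uses
random vectors as a stand-in for image data; the function names are hypothetical.

```python
import numpy as np


def scatter(images: np.ndarray) -> float:
    """Mean squared Euclidean distance over all ordered pairs (i, j) of rows."""
    diffs = images[:, None, :] - images[None, :, :]
    return float(np.mean(np.sum(diffs ** 2, axis=2)))


def twice_trace_covariance(images: np.ndarray) -> float:
    """Twice the trace of the covariance of the mean-subtracted ensemble."""
    U = (images - images.mean(axis=0)).T     # D x P, one image per column
    return float(2.0 * np.trace(U @ U.T) / U.shape[1])  # 1/P factor is an assumption


# Hypothetical ensemble of 40 vectors with 12 components each.
rng = np.random.default_rng(1)
images = rng.normal(size=(40, 12))
print(scatter(images), twice_trace_covariance(images))
```

The two numbers agree to rounding error, so the scatter can be obtained from the eigenvalues
without forming all P² pairwise differences.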

The diagonal elements of the covariance matrix are the variances of the image pixels. The
total variance is conserved under any unitary transformation:

$$\sum_i R_{ii} = \mathrm{Trace}\, R = \mathrm{Trace}\, \Lambda. \tag{13}$$

It is found that the scatter statistic is proportional to the sum of the eigenvalues. Further,
expressing eigenvalues as a percentage of their total yields the percentage of the ensemble
variance represented by a subset of N eigenvectors. For comparison of the two databases
the difference in the percentages as a function of N is discussed. Some eigenfunctions
describe image variance that is relevant to classification, and others describe variation that is
representative of noise. If an eigenspectrum is wide, then the percentage variance described
by the N leading eigenvectors will be small. If the cross validation percentage recognition is
high, then the information discarded by using an incomplete KLT is irrelevant even though
there is much of it. Alternatively, if the eigenspectrum is narrow, with much of the variance
captured, then a low recognition rate implies that the discarded transform coefficients are
valuable. This latter sensitivity to the high-order KL-transforms is undesirable since the
motivation for feature extraction is reduced dimensionality.
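
The percentage of variance carried by the N leading eigenvectors, and its difference between
two ensembles, can be sketched as follows. The two eigenspectra are invented for illustration
and are not the SD3 or TD1 spectra.

```python
import numpy as np


def percent_variance(eigvals, n: int) -> float:
    """Percentage of the total variance carried by the n largest eigenvalues."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return float(100.0 * ev[:n].sum() / ev.sum())


# Hypothetical eigenspectra with equal total variance:
narrow = [8.0, 4.0, 2.0, 1.0, 1.0]   # steeply decaying
wide = [4.0, 3.0, 3.0, 3.0, 3.0]     # nearly flat
difference = percent_variance(narrow, 2) - percent_variance(wide, 2)
print(difference)
```

With the same number of retained coefficients, the wide spectrum leaves far more of its
variance outside the feature vector, which is the situation the paragraph describes.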


3.4  Cross  Validation  Results 

This study generated a Validation Comparison Matrix. The matrix has rank two and
dimension equal to the number of databases in the comparison, which in this case is also two. The
row and column indices of the matrix denote, respectively, the databases used for training
and testing. The absolute classification error rates in the matrix are taken as irrelevant
since the entries were all produced using the same classifier, which was not particularly well
optimized. The interesting features are the relative percentages discussed below.

The on-diagonal terms, $c_{ii}$, indicate the mean error rates for standard v-fold cross validation
of the $i$th database. The off-diagonal elements, $c_{ij}$, $i \neq j$, result from cross-cross validation.
The first u partitions of database i are used as training sets for the v-fold cross validation of
the $j$th database. In the case of u-fold partitioning of the training set there will be uv results,
the mean of which is $c_{ij}$.

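All mean elements have an attached sample standard deviation,

$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}, \tag{14}$$

where the $x_i$ are the N individual error rates and $\bar{x}$ is their mean.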

If a homogeneous dataset is large enough then this quantity will approach zero. The
standard deviation is also a function of the data set redundancy. For instance, consider a
database to which a copy of itself is appended, and which is classified with, for example, a
single nearest neighbor algorithm. Perfect recognition could then be achieved if, as in the
cross validation scheme used here, the partitions are contiguous blocks from the dataset.
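
The cross validation procedure and the redundancy effect just described can be sketched with a
single nearest neighbor classifier. The data here are random and the function names are
hypothetical; the partitions are contiguous blocks, as in the scheme described above.

```python
import numpy as np


def nn_error_rate(train_x, train_y, test_x, test_y) -> float:
    """Error rate of a single nearest neighbor classifier."""
    d = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    predicted = train_y[np.argmin(d, axis=1)]
    return float(np.mean(predicted != test_y))


def cross_validate(x, y, v=10):
    """v-fold cross validation over contiguous blocks: (mean error, sample std)."""
    blocks = np.array_split(np.arange(len(x)), v)
    errors = []
    for k in range(v):
        test = blocks[k]
        train = np.concatenate([blocks[i] for i in range(v) if i != k])
        errors.append(nn_error_rate(x[train], y[train], x[test], y[test]))
    errors = np.array(errors)
    return float(errors.mean()), float(errors.std(ddof=1))


# Hypothetical database with no structure at all: random features, random labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))
y = rng.integers(0, 10, size=100)
plain_mean, plain_std = cross_validate(x, y)

# The same database with a copy of itself appended: every test pattern now has
# an exact duplicate in some other block, so recognition becomes perfect.
dup_mean, dup_std = cross_validate(np.vstack([x, x]), np.concatenate([y, y]))
print(plain_mean, dup_mean)
```

The duplicated database scores zero error even though its labels are random, which shows why
redundancy has to be considered when interpreting the means and standard deviations.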

The standard error is accessible by dividing the standard deviations by a further $\sqrt{N}$, where
N = 10. The discussion of the comparisons of the means and the variances is aided by
invoking the results of Student’s “t test” and the “f test” (see for example [66]).

They are used to assess whether two distributions have the same mean and the same
variances. The entire corpus of human hand-printed characters may be considered as one
distribution of which SD3 and TD1 are subsets, but for this study the two sets are extracted from
different distributions, namely the two social writer groups outlined in the introduction. The
t-test quantifies the difference in two means as a function of their mutual root mean square
standard error.


$$t = \frac{\mu_1 - \mu_2}{\sqrt{\dfrac{\sigma_1^2}{N_1} + \dfrac{\sigma_2^2}{N_2}}} \tag{15}$$


Attached to t is a significance, 0 < p < 1, giving the probability that |t| could be at least
this large by chance. That is, if p takes on a “small” value then the distributions have
significantly different means. Similarly the f-test quantifies two variances as a ratio taken
to be greater than 1 (i.e. either $\sigma_1^2/\sigma_2^2$ or its reciprocal). The value of f directly indicates
differing variances. The attached significance, p, is again a probability. Small values indicate
significantly different variances.
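
The two statistics can be computed directly from the means and sample standard deviations of
two sets of partition results. The sketch below uses made-up inputs and hypothetical function
names; the significance p would additionally require the cumulative t and F distributions,
which are not computed here.

```python
import math


def t_statistic(mean1, sd1, n1, mean2, sd2, n2) -> float:
    """Difference of two means in units of their combined standard error."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def f_statistic(sd1, sd2) -> float:
    """Ratio of the two variances, arranged so that it is at least 1."""
    v1, v2 = sd1 ** 2, sd2 ** 2
    return max(v1, v2) / min(v1, v2)


# Hypothetical example: two sets of 10 partition error rates (in percent).
t = t_statistic(5.0, 1.0, 10, 3.0, 1.0, 10)
f = f_statistic(1.0, 2.0)
print(t, f)
```

A large |t| with similar variances corresponds to the case reported below, where the means
differ significantly but the variances do not.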

The statistics are derived from the two samples obtained by testing the 10 partitions of SD3
and TD1 data using one or other training set. In all cases, digits, upper-case and lower-case
letters, the calculated value of t is found with very low significance, indicating the mean
differences are not at all spurious. However, in no case does the attached probability for the
f-test indicate that the variances are significantly different.


3.4.1  Digits 

The handprinted digits of the first 500 writers of SD3 were partitioned into blocks each
containing digits from 50 writers. The numbers of characters in these ten sets were not
identical but varied by only 0.2%. The number of SD3 digits totalled 53449. The 500
writers of TD1 were similarly partitioned. The number of TD1 digits totalled 58646. The
pure 10 fold cross validations for SD3 and TD1 were obtained using the characters of 90% of
the writers as training prototypes with the characters of the remaining 10% used for testing.
The mean of the zero-rejection-rate error rates for the cross validations are quoted on the
diagonal of the Table below.

The first partition of SD3 (450 writers) was used as prototypes for the classification of all
10 sets of characters of TD1, and vice versa. The off-diagonal elements of the validation
comparison matrix, so obtained, are also given in the Table below.

Error % ± σ               Test SD3          Test TD1          t-test              f-test
                          (50 writers)      (50 writers)
Train SD3 (450 writers)   1.7% ± 0.3        6.8% ± 0.4        t = 28.5, p = 0.0   f = 1.5, p = 0.3
Train TD1 (450 writers)   3.5% ± 0.3        3.8% ± 0.5        t = 1.4, p = 0.2    f = 2.1, p = 0.3

Table 4: Inter and Intra database Cross Validation Recognition Errors for Digits

The most relevant result from this table is that, using the classifier as described above,
training solely on SD3 implies a 5% loss when classifying TD1. This is effectively NIST’s
experience with its NIST_1 and NIST_3 systems reported in Appendix E.

The on-diagonal elements of the cross validation matrix show that SD3 is a less diverse digit
set than TD1. That is, the test partitions of SD3 are more like their training sets, in the
nearest neighbor sense, than is the case with TD1. Greater on-diagonal terms indicate a
higher intrinsic diversity for that database. If we relate the low TD1 classification to the
width of the eigenvalue spectrum or the volume of the eigenspace, it is apparent that TD1
would benefit from the use of a more complete KLT as input to the classifier.

Figure 8 shows the eigenspectra of the SD3 and TD1 characters. Note in particular that the
total variances for the 1024 pixel images are 575.5 (SD3) and 636.8 (TD1), indicating that
TD1 is absolutely more diverse (larger scatter). Approximately 6.6% more of the variance
of SD3 is described with 48 KL eigenvectors (as used by the classifier) than is the case for
TD1.

The off-diagonal terms show that use of SD3 as a training set for testing with TD1 is markedly
inferior to use of TD1 as a training set for testing with SD3. The implication is that TD1
is a superset of the SD3 set, i.e. TD1 contains sufficiently distributed prototypes to classify
SD3 - whereas TD1 contains exemplars that are not “closely” present in SD3. That TD1
classifies itself and SD3 equally (to within one standard deviation) implies that TD1 is a
more general dataset.

3.4.2  Uppers 

Error % ± σ               Test SD3          Test TD1          t-test              f-test
                          (48 writers)      (50 writers)
Train SD3 (432 writers)   14.2% ± 1.4       19.4% ± 1.4       t = 7.9, p = 0.0    f = 1.0, p = 0.8
Train TD1 (450 writers)   19.3% ± 1.7       16.5% ± 1.4       t = 3.8, p = 0.0    f = 1.5, p = 0.4

Table 5: Inter and Intra database Cross Validation Recognition Errors for Uppers

The handprinted upper case letters of the first 480^1 writers of SD3 were partitioned into
blocks from 48 writers. The upper case letters totalled 10790 examples. The 500 writers
of TD1, similarly partitioned, yielded 11941 characters. As in the case of digits, there is a
5% difference between the classification of SD3 on itself and on TD1. Again, TD1 is more
diverse in classification of itself than is the case with SD3. On the other hand, the total
variances are 734.0 (SD3) and 650.2 (TD1), indicating that SD3 is absolutely more diverse.
With classification using 96 KL coefficients the percentage variance captured for SD3 was
4.8% less than that for TD1.

The off-diagonal elements, however, are the same, indicating that neither set is more general
than the other. That the off-diagonal results are worse (higher error rates) than the on-diagonal
ones indicates that the databases contain unique subsets that require specialist knowledge
contained only in that database.

3.4.3  Lowers 

The handprinted lower case letters of the first 490^2 writers of SD3 were partitioned into
blocks from 49 writers. The lower case letters totalled 10968. The 500 writers of TD1,
similarly partitioned, yielded 12000 characters. As with the upper case letters, but not
with the digits, the total variances are 740.4 (SD3) and 637.9 (TD1), indicating that SD3 is
absolutely more diverse. With classification using 96 KL coefficients the percentage variance
captured for SD3 was 2.3% less than that for TD1. The cross validation matrix shows
that the lower case datasets are equally difficult and yet different - they are insufficiently
general to classify each other as well as they classify themselves.

Error % ± σ               Test SD3          Test TD1          t-test              f-test
                          (49 writers)      (50 writers)
Train SD3 (441 writers)   19.6% ± 1.4       23.5% ± 1.4       t = 5.9, p = 0.0    f = 1.1, p = 0.8
Train TD1 (450 writers)   25.9% ± 1.8       19.2% ± 1.1       t = 9.6, p = 0.0    f = 2.5, p = 0.1

Table 6: Inter and Intra database Cross Validation Recognition Errors for Lowers

^1 None of the upper case letters of twenty writers were segmented in the preparation of SD3.
See subsection 3.5.

^2 None of the lowercase letters of 10 writers were segmented in the preparation of SD3.
See subsection 3.5.


3.5  Caveats 

3.5.1  Segmentation 

This initial study reports work NIST conducted immediately after the Conference. As such
it is a provisional investigation of database quality; it is not experimentally flawless and
therefore the conclusion that SD3 is cleaner than TD1, at least for digits, does not necessarily
apply to the forms from which the two databases of characters were segmented.

One reason for this is that SD3 and TD1, both obtained from fields of full page forms,
were arrived at with different character segmenters. From a possible 65000 characters on
each 500 form set, final numbers of human-checked characters were 53449 (SD3) and 58646
(TD1). The SD3 segmenter, an old version, produced 9% fewer isolated characters than the
updated model used for TD1. The principal reason for failure of the SD3 segmenter was the
inability to segment connected or overlapping characters. If the characters from SD3 that
were not segmented resemble the difficult digit images that apparently characterize TD1,
then the difference between the two databases may not be writer-letter dependent at all.
Instead it could represent writer-connectivity/overlap differences between the two different
writer groups, or the differences in ability of the different segmenters to segment connected
or overlapping characters that tend to be difficult in other ways as well.

This problem can be negated by resegmenting and rechecking the characters in SD3 using
the identical algorithms applied to TD1. A new database, a superset of SD3, would then be
obtained, which could then be used in a more controlled comparison with TD1.

3.5.2  Classifier  Dependence 

The eigenvalue spectrum describes the information loss suffered when only the KL
eigenvectors having the largest eigenvalues are used in classification. The classification of incomplete
KLT’s is peculiar in that variance ordered information is discarded. Using a much higher
number of coefficients in the digit classification might equalize the on-diagonal cross
validation entries. Even with a higher dimensional (but lower variance) KL space, aggregated
functionally approximated MLP classifiers are not able to recognize minority patterns. Under
these conditions, the nearest neighbor schemes do better.

Instead of using a “lossy” incomplete feature classifier it is possible to use a full description
of the image: the complete KL transform. Variance equalization may be more reasonable
- choose the number of KL features corresponding to either an absolute level of described
variance or percentage thereof. Thus in the case of the digits in SD3, 43 eigenvectors
describe 75% of the variance, whereas to reach this level with TD1, 70 KL coefficients are
required. Alternatively, features that do not bias information loss may be used. For example,
image row and column pixel histograms or orthogonal moments are known to be classifiable.
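
Variance equalization as described above amounts to choosing the smallest number of KL
coefficients whose eigenvalues reach a target share of the total variance. A minimal sketch,
with a hypothetical function name and an invented eigenspectrum:

```python
import numpy as np


def n_features_for_variance(eigvals, target_percent: float) -> int:
    """Smallest number of leading eigenvalues describing target_percent of the variance."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cumulative = 100.0 * np.cumsum(ev) / ev.sum()
    n = int(np.searchsorted(cumulative, target_percent, side="left")) + 1
    return min(n, len(ev))   # a target above 100% cannot be met; return all features


# Hypothetical eigenspectrum (not the SD3 or TD1 spectrum).
spectrum = [4.0, 3.0, 2.0, 1.0]
print(n_features_for_variance(spectrum, 75.0))
```

Applied to two databases with the same target, this yields a different feature count for each,
so that both classifiers see the same fraction of their own ensemble variance.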


3.6  Conclusions 

Given the experimental scheme described, it appears that TD1 does contain a more diverse
and general digit set than SD3. The latter classified disjoint digit subsets of itself with a
5% lower error rate than that with which it classified TD1. Furthermore, TD1 yields a
3% improvement over SD3 in the classification of disjoint digit subsets of the former. The
hypothesis that differing writer populations are responsible for this difference has not been
proven. Indeed, the fact that the cross validations for the upper and lower case letters yield
very different results seems to weaken this argument for digits.

Further, more controlled experimentation is necessary and on-going. The a priori measures
of image variance are similarly inconclusive.

Figure 8: Eigenvalue vs Index for SD3 and TD1. From top: digits, uppers and lowers. All
writers were used.

4 System Error Rates Versus Rejection Rates

Jon  Geist  and  R.  Allen  Wilkinson 


4.1  Theory 


Let q(r) be the probability, as a function of rejection rate r, that a rejected classification is
an incorrect classification. In this case, the error rate e(r), which is defined as the ratio
of accepted (unrejected) classifications that are incorrect to the total number of accepted
classifications, is given by

    e(r) = \frac{e(0) - f(r)}{1 - r},    (16)

where

    f(r) = \int_0^r q(s)\,ds    (17)

is the fraction of all classifications that have been rejected at rejection rate r and are
actually incorrect. Equations 16 and 17 may be combined to give the slope of the error rate,

    e'(r) = \frac{e(r) - q(r)}{1 - r}.    (18)


If e'(r) is zero in eq. 18, then

    q(r) = e(r) = e_0,    (19)

where e_0 is a constant. This means that the probability of rejecting an incorrect classification
is equal to the fraction of incorrect classifications remaining in the unrejected sample. In
this case, the rejection mechanism just rejects classifications at random.

If q(r) is equal to the constant q_0 over some subrange of r, then

    e(r) = \frac{e(0) - q_0 r}{1 - r}    (20)

and

    e'(r) = \frac{-q_0 + e(0)}{(1 - r)^2}    (21)

over the same subrange.


Therefore, a perfect rejection mechanism is characterized by q(r) = 1 for 0 < r < e(0), and
q(r) = 0 for r > e(0), in which case

    e(r) = \frac{e(0) - r}{1 - r}    (22)

for 0 < r < e(0), and

    e(r) = 0    (23)

for r > e(0).

It is clear from a cursory investigation of Figures 1 through 6 in Section 2 that none of the
submissions to the Conference comes close to a perfect rejection mechanism, yet all of their
e(r) curves seem to have similar shapes. The next section describes an experiment that was
carried out to test this observation.
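The two limiting cases of the theory, random rejection and perfect rejection, can be checked numerically. The sketch below evaluates the error rate as (e(0) - f(r)) / (1 - r), with f(r) the integral of q from 0 to r, using a midpoint rule. The zero-rejection error rate of 5% and the function names are illustrative assumptions, not values from the Conference.

```python
def error_rate(q, e0, r, steps=10000):
    """Error rate after rejecting a fraction r of the classifications.

    q  : function giving the probability that a classification rejected at
         rejection rate s is incorrect
    e0 : error rate at zero rejection, e(0)
    """
    h = r / steps
    # Midpoint-rule estimate of f(r), the integral of q from 0 to r.
    f = sum(q((i + 0.5) * h) for i in range(steps)) * h
    return (e0 - f) / (1.0 - r)


E0 = 0.05  # hypothetical zero-rejection error rate


def q_random(s):
    # Random rejection: rejected items are wrong as often as accepted ones.
    return E0


def q_perfect(s):
    # Perfect rejection: every rejected item is wrong until no errors remain.
    return 1.0 if s < E0 else 0.0


flat = error_rate(q_random, E0, 0.10)       # stays at e(0)
partial = error_rate(q_perfect, E0, 0.02)   # (e(0) - r) / (1 - r)
cleared = error_rate(q_perfect, E0, 0.08)   # no errors left once r > e(0)
```

With random rejection the error rate does not change with r, while the perfect mechanism drives it to zero as soon as r exceeds e(0).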


4.2  Fit  of  Model  to  Experimental  Data 

A visual examination of the curves in Figures 1 through 6 in Section 2 suggests that they
might be well described by

    e(r) = \frac{e_{min} + (e_0 - e_{min}) \exp(-r/r_0)}{1 - r}.    (24)

To test this conjecture, we fit the logarithms of the measured e(r) curves to the logarithm
of eq. 24 over the range 0 < r < 0.14, where e_0, e_min, and r_0 were adjusted in the fit.
Logarithms were used to optimize the shape of the fits, rather than the values near the
maximum of e(r).

The results of the fits are summarized in Tables 7, 8, and 9, which list for the digit, the
upper case, and the lower case test, respectively, the values of e_0, e_min, and r_0 for each system
that participated in each test. These tables also list the residual standard deviation of each
fit, and the ratio of the actual value of e'(0) for each system to the ideal value of e'(0) for
a perfect rejection process for that system. It is also noteworthy that it appears easy for a
system to obtain a value of e'(0) that is greater than 30% of the ideal value, but it appears
very difficult for a system to obtain a value of e'(0) that is greater than 80% of the ideal
value.

Eight  data  points  were  used  in  each  fit.  Three  parameters  were  estimated.  This  leaves  five 
degrees  of  freedom  in  each  fit.  Because  the  fits  were  carried  out  on  the  logarithms  of  the 
data,  the  residual  standard  deviations  of  the  fits  are  actually  the  standard  deviations  of  the 
relative  differences  between  the  measured  error  rates  and  those  predicted  by  eq.  24.  Thus  a 
residual  standard  deviation  of  0.01  corresponds  to  a standard  deviation  of  the  relative  errors 
of  the  fit  of  1%  over  the  range  of  the  fit. 
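The link between the residual standard deviation of a logarithmic fit and the relative error can be illustrated with a short sketch. It assumes the model form e(r) = (e_min + (e_0 - e_min) exp(-r/r_0)) / (1 - r), which is the form implied by eqs. 26 and 27, and it evaluates residuals for given parameters rather than performing the fit itself. The rejection-rate spacing and the synthetic 1% scatter are hypothetical; the parameter values are the AEG digit entries of Table 7.

```python
import math


def model_error_rate(r, e0, e_min, r0):
    """Assumed model: (e_min + (e0 - e_min) * exp(-r / r0)) / (1 - r)."""
    return (e_min + (e0 - e_min) * math.exp(-r / r0)) / (1.0 - r)


def residual_std_log(rates, measured, params, n_params=3):
    """Residual standard deviation of log(measured) - log(model)."""
    residuals = [
        math.log(m) - math.log(model_error_rate(r, *params))
        for r, m in zip(rates, measured)
    ]
    # Eight points minus three fitted parameters leaves five degrees of freedom.
    degrees_of_freedom = len(rates) - n_params
    return math.sqrt(sum(x * x for x in residuals) / degrees_of_freedom)


params = (0.034726, 0.001083, 0.052480)   # e_0, e_min, r_0 for AEG (Table 7)
rates = [0.02 * i for i in range(8)]      # hypothetical spacing, 0.00 to 0.14
# Synthetic "measurements": the model scattered by +/- 1% (an assumption).
measured = [
    model_error_rate(r, *params) * (1.01 if i % 2 == 0 else 0.99)
    for i, r in enumerate(rates)
]
sigma = residual_std_log(rates, measured, params)
```

Because log(1 + x) is close to x for small x, a 1% relative scatter gives log residuals of about 0.01 each; dividing by five degrees of freedom rather than eight points raises the resulting value by a factor of sqrt(8/5).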

It is remarkable how well eq. 24 can be fit to the e(r) curves for the various different systems
over the 0 to 14% subrange of r. In fact, most of the e(r) curves are well described by eq.
24 over a subrange 0 < r < r_s1, and by

    e(r) = e_s    (25)

over a subrange r_s1 < r < r_s2, where e_s, r_s1, and r_s2 are system dependent constants,
r_s1 < r_s2, and r_s2 > 14%.

4.3  Conclusion 

Equation 25 corresponds to the case where the rejection process has degenerated to a random
sampling of the unrejected classifications, as described in connection with eq. 19. Equation
24, on the other hand, corresponds to the case where the probability of rejecting a
classification that is actually incorrect is given by

    q(r) = \frac{e_0 - e_{min}}{r_0} \exp(-r/r_0),    (26)

which can be rewritten in terms of e(r) as

    q(r) = \frac{e(r)(1 - r) - e_{min}}{r_0},    (27)

and which is bounded above by

    q(r) = \frac{e(r)}{r_0}.    (28)


For most of the systems in the Conference, e_min ≪ e(0). Therefore, for small r, eq. 28 is
a good approximation to the probability distribution of eq. 27 for most of the systems.
This distribution is an improvement by a factor of 1/r_0 over the probability distribution for
a completely random rejection process, but it is still greatly inferior to the ideal distribution.
In fact, no probability distribution that is proportional to e(r) can be efficient, because the
very act of reducing e(r) through the rejection process reduces the efficiency with which
incorrect classifications are rejected.
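The relations among eqs. 26 through 28 can be checked numerically. The sketch below assumes the exponential model e(r) = (e_min + (e_0 - e_min) exp(-r/r_0)) / (1 - r) for eq. 24; the function names are illustrative, and the parameters are the AEG digit values from Table 7.

```python
import math


def e_model(r, e0, e_min, r0):
    """Assumed eq. 24: modelled error rate at rejection rate r."""
    return (e_min + (e0 - e_min) * math.exp(-r / r0)) / (1.0 - r)


def q_exponential(r, e0, e_min, r0):
    """Eq. 26: probability that a rejected classification is incorrect."""
    return (e0 - e_min) / r0 * math.exp(-r / r0)


def q_from_error_rate(r, e0, e_min, r0):
    """Eq. 27: the same probability written in terms of e(r)."""
    return (e_model(r, e0, e_min, r0) * (1.0 - r) - e_min) / r0


def q_upper_bound(r, e0, e_min, r0):
    """Eq. 28: e(r) / r0, an upper bound on q(r)."""
    return e_model(r, e0, e_min, r0) / r0


params = (0.034726, 0.001083, 0.052480)   # e_0, e_min, r_0 for AEG (Table 7)
checks = [
    (q_exponential(r, *params), q_from_error_rate(r, *params), q_upper_bound(r, *params))
    for r in (0.0, 0.02, 0.05, 0.10, 0.14)
]
# Gain over random rejection at small r, where q would equal e(r).
gain_over_random = q_upper_bound(0.0, *params) / e_model(0.0, *params)
```

At r = 0 the ratio of the bound to e(0) is exactly 1/r_0, which is the improvement factor over random rejection quoted above.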

The fact that eq. 24 describes all of the e(r) curves reasonably well would seem to suggest
that  some  limiting  behavior  is  being  approached.  On  the  other  hand,  the  fact  that  there  is 
so  much  difference  between  the  shapes  of  the  e(r)  curves  for  the  systems  in  the  Conference 
and  the  ideal  curve  for  a perfect  rejection  process  suggests  that  there  is  considerable  room 
for  improvement.  This  is  a paradox  that  will  not  be  resolved  without  further  research. 

SYSTEM      sigma      e_0        e_min      r_0        ratio
AEG         0.028356   0.034726   0.001083   0.052480   0.585633
ASOL        0.031873   0.092238   0.000000   0.203180   0.241507
ATT_1       0.029338   0.032628   0.001808   0.050910   0.447927
ATT_2       0.015908   0.036287   0.001320   0.053330   0.673430
ATT_3       0.072066   0.050545   0.007738   0.048070   0.666328
ATT_4       0.019877   0.041735   0.001156   0.060660   0.583642
ERIM_1      0.020684   0.039069   0.000177   0.059683   0.634640
ERIM_2      0.015108   0.039497   0.000880   0.063479   0.561363
GTESS_1     0.012625   0.066724   0.000000   0.104407   0.560962
GTESS_2     0.006790   0.067664   0.002970   0.102696   0.592692
HUGHES_1    0.028769   0.050092   0.000000   0.084586   0.462246
HUGHES_2    0.029808   0.049664   0.000000   0.090055   0.519539
IBM         0.014380   0.034863   0.001592   0.052278   0.629884
IFAX        0.003250   0.170347   0.019570   0.206207   0.710072
KODAK_1     0.041515   0.049025   0.000850   0.076357   0.468308
KODAK_2     0.019072   0.041296   0.000600   0.070753   0.467999
NESTOR      0.016548   0.045243   0.002240   0.064995   0.609639
NIST_2      0.004107   0.091833   0.000004   0.146915   0.600810
NIST_3      0.005298   0.097302   0.000017   0.138602   0.691096
NYNEX       0.024356   0.044058   0.002230   0.067403   0.540163
OCRSYS      0.004160   0.015539   0.013377   0.034779   0.038724
THINK_1     0.009330   0.049349   0.001670   0.072030   0.586927
THINK_2     0.019543   0.038205   0.002250   0.053868   0.717975
UPENN       0.003870   0.090498   0.000393   0.148420   0.587952
VALEN_1     0.007752   0.181158   0.000001   0.252546   0.547717
VALEN_2     0.012972   0.159487   0.000000   0.222795   0.542634

Table 7: Parameters of fit over range from 0 to 14% of model to error versus rejection rate
curves for systems submitting classifications and confidence values for the digit test.

SYSTEM      sigma      e_0        e_min      r_0        ratio
AEG         0.017454   0.038010   0.004459   0.053350   0.527693
ASOL        0.012070   0.113362   0.000000   0.248373   0.282966
ATT_1       0.011175   0.065826   0.000000   0.126673   0.487130
ATT_2       0.011336   0.056636   0.002750   0.071867   0.699079
ATT_3       0.032949   0.070109   0.000000   0.104839   0.530980
ATT_4       0.027155   0.050839   0.000950   0.070190   0.604783
ERIM_1      0.020207   0.051538   0.000940   0.083857   0.600994
GTESS_1     0.012625   0.066724   0.000000   0.104407   0.560962
GTESS_2     0.006790   0.067664   0.002970   0.102696   0.592692
HUGHES_1    0.027327   0.066304   0.000000   0.116388   0.420525
HUGHES_2    0.042265   0.070535   0.000000   0.111462   0.376435
IBM         0.005457   0.064087   0.001980   0.085055   0.700729
IFAX        0.001717   0.195923   0.017783   0.236433   0.697279
KODAK_1     0.023934   0.070835   0.000000   0.104786   0.561767
NESTOR      0.028858   0.060172   0.000000   0.100379   0.511940
NIST_2      0.001366   0.231152   0.014857   0.257046   0.759154
NIST_3      0.010560   0.170368   0.000000   0.205900   0.790337
NYNEX       0.005835   0.049212   0.004903   0.071414   0.584223
OCRSYS      0.000000   0.057283   0.010000   0.085924   0.006016
UMICH_1     0.013518   0.050340   0.020245   0.089622   0.386150
VALEN_1     0.007327   0.243245   0.000028   0.339330   0.560223

Table 8: Parameters of fit over range from 0 to 14% of model to error versus rejection rate
curves for systems submitting classifications and confidence values for the upper case letter
test.

SYSTEM      sigma      e_0        e_min      r_0        ratio
AEG         0.018673   0.129064   0.000002   0.197466   0.549637
ASOL        0.006330   0.212562   0.070999   0.219958   0.519595
ATT_1       0.011495   0.139576   0.000000   0.232988   0.431578
ATT_2       0.006518   0.141519   0.000000   0.175623   0.710256
ATT_3       0.006362   0.163227   0.000000   0.229147   0.733748
ATT_4       0.011273   0.144687   0.000000   0.195346   0.590847
ERIM_1      0.006800   0.137500   0.000000   0.185982   0.793400
GTESS_1     0.005090   0.176155   0.007665   0.229252   0.562496
GTESS_2     0.006994   0.185672   0.000000   0.240376   0.600176
HUGHES_1    0.009387   0.153990   0.000020   0.240259   0.690638
HUGHES_2    0.013273   0.156073   0.000000   0.247943   0.689370
IBM         0.004598   0.154840   0.000008   0.206804   0.674157
KODAK_1     0.013933   0.147603   0.000000   0.206858   0.493963
NESTOR      0.010327   0.155805   0.000003   0.207386   0.582884
NIST_2      0.004220   0.313207   0.000015   0.366355   0.720330
NIST_3      0.004278   0.203856   0.000000   0.251538   0.711280
NYNEX       0.008670   0.140270   0.000002   0.219168   0.581997
OCRSYS      0.002186   0.136838   0.038749   0.171823   0.500069
UMICH_1     0.002900   0.150510   0.043706   0.197158   0.489376
VALEN_1     0.003552   0.317315   0.000000   0.438767   0.514008

Table 9: Parameters of fit over range from 0 to 14% of model to error versus rejection rate
curves for systems submitting classifications and confidence values for the lower case letter
test.

5 Types of Algorithms Used


Charles  L.  Wilson 


5.1  Rule-based  versus  Machine  learning 

In the past few years neural networks have become important as a possible method for
constructing computer programs that can solve problems, such as speech and character
recognition, where “human-like” response or artificial intelligence is needed. The most
useful characteristics of neural networks are their ability to learn from examples, their ability to
operate in parallel, and their ability to perform well using data that are noisy or incomplete.
Many of these characteristics are shared by various statistical pattern recognition methods.
These characteristics of pattern recognition systems are important for solving real problems
from the field of character recognition exemplified by this report, as opposed to “toy”
problems. The goal of this section is to summarize the different methods used at the Census
OCR Conference in a way that will illustrate why neural networks and rule based methods
achieved the level of performance that they did. The various methods used are summarized
in Figure 9 for classification and feature extraction. Most of the systems presented at the
Conference, but not all, used separate methods of feature extraction and classification. In
the discussion presented here any image processing which preceded the feature extraction is
combined with feature extraction.

The discriminant function and classification sections of the systems are of two types: adaptive
learning based and rule-based. The most common approach to machine learning based
systems used at the Conference was neural networks. The neural approach to machine
learning was originally devised by Rosenblatt [67] by connecting together a layer of artificial
neurons [68] in a perceptron network. The weaknesses which were present in this approach
were analyzed by Minsky and Papert [69]. The results of this Conference suggest that many
of these weaknesses are still important. The advent of new methods for network construction
and training during the last ten years led to rapid expansions in neural network research in
the late 1980s. Many of the methods referred to in Figure 9 were developed in this period.

Adaptive learning is further subdivided into two types, supervised learning and self-organization.
The material presented in this report does not cover the mathematical detail of these
methods, but the bibliographic references provided with many of the systems discuss these
methods in detail. A good source of general information on neural networks is Lippmann’s review
[70]. The primary research sources for neural networks are available in Anderson and
Rosenfeld [71]. More detailed information on the supervised learning methods discussed here is
given in [72]; self-organizing methods are discussed by Kohonen [73] and Grossberg [74].

The principal difference between neural network methods and rule-based methods is that the
former attempt to simulate intelligent behavior by using adaptive learning and the latter
use logical symbol manipulation. The two most common rule-based approaches at the
Conference were those derived from mathematical image processing and those derived from
statistics. With adaptive learning, once the learning phase has been completed the network
response is automatic and similar in nature to reflex responses in living organisms. The
processes where these methods have been most successful are in areas where human responses
are automatic, such as touching one’s nose or recognizing characters. With mathematical
approaches, fixed operations are performed on individual images or on statistical samples of
images.

The alternate approach to artificial intelligence is rule-based. Rather than teaching the
program to differentiate between characters, a rule-based program is constructed to distinguish
among the various characters by writing rules to be followed by the system. These are
explicitly programmed in the system in the form of mathematical formulas.

Most  of  the  OCR  implementations  discussed  in  this  report  combine  several  methods  to  carry 
out  preprocessing  (filtering)  and  feature  extraction.  Many  of  the  filtering  methods  used  are 
based  on  methods  described  in  texts  on  image  processing  such  as  [65]  and  on  a method  based 
on  KL  transforms  [39].  In  these  methods,  the  recognition  is  done  using  features  extracted 
from  the  primary  image  by  rule  based  techniques.  The  filtering  and  feature  extraction 
processes  start  with  an  image  of  a character.  The  features  produced  are  then  used  as  the 
input  for  classification. 

In  a self-organizing  method,  such  as  [19],  data  is  applied  directly  to  the  neural  network  and 
any  filtering  is  learned  as  features  are  extracted.  In  a supervised  method,  the  features  are 
extracted  using  either  rule-based  or  adaptive  methods  and  classification  is  carried  out  using 
either  type  of  method.  Systems  with  all  four  possible  combinations  of  rules  and  adaptive 
learning  were  used  at  the  Conference. 

5.2  Statistical  Rules  versus  Mathematical  Rules 

In  Figure  9,  rules  based  on  mathematical  image  processing  are  distinguished  from  rules 
based  on  statistics.  These  two  types  of  rules  are  similar  in  that  they  both  derive  features 
based  on  a model  of  the  images.  Statistical  rules  derive  these  model  parameters  based  on 
the  data  presented.  For  example,  typical  model  parameters  might  be  sample  means  and 
variances.  Mathematical  rules  operate  on  the  data  based  on  external  model  parameters  or 
on  the  specific  data  being  analyzed.  The  model  parameters  might  be  designed  to  detect 
strokes,  curvature,  holes,  or  concave  or  convex  surfaces. 


5.3  Linear  versus  Non-linear  Methods 

All of the methods shown in Figure 9 can also be classed broadly into linear methods,
such as perceptrons [67], and nonlinear methods, such as Multi-Layer Perceptrons (MLPs)
[72]. This separation into linear and non-linear algorithms also extends to mathematical
and statistical methods. Many of the convolution and transform methods, such as Walsh
transforms [75] or combinations of Gabor transforms [28], are linear. Other methods start with
linear operations such as correlation matrices and become non-linear by removing information
with low statistical significance; KL transforms [65] and principal component analysis (PCA)
[64] are examples of this.

5.4 Statistical and Neural Methods


When training data is used to adjust statistical model parameters to train MLPs, certain
methods may be classed as either neural network or statistical methods. The probabilistic
neural network (PNN) is an example of this type of method. In another context PNN
methods can be regarded as one class of a radial basis function (RBF) method. The information
in Figure 9 classifies methods of this kind in an arbitrary way when statistical accumulation
or neural network models of a given method are equivalent.

5.5 Role of Learning and Rules in Feature Extraction and Classification

The systems submitted for testing at the Conference used all of the four combinations of
rule-based and learning-based feature extraction and classification. All possible combinations
yielded at least one low error rate system. The most common combination was the use of a
mathematically based feature extractor with an MLP classifier. At least one system combined
feature extraction with classification [6]. One major surprise was that linear methods, such as
Learning Vector Quantization (LVQ) [73] and PNN, performed as well as highly non-linear
methods such as MLPs.

A possible explanation for this can be found in Bayesian models of the learning and
recognition process [76], [77], and [78]. The relationship between testing error, E_tst, and training
error, E_trn, is given by:

    E_{tst} = E_{trn} + 2 \sigma_{eff}^2 \frac{p_{eff}}{n}

where \sigma_{eff}^2 is the effective noise in the network variables, p_eff is the effective number of
network parameters, and n is the size of the training sample.

The noise in the network is learned from the training sample and should be similar for
all participants. Most participants achieved training errors of less than 0.5%. The strong
similarity of accuracy results suggests that all of the methods used maintain a fixed ratio of
complexity to sample size. This would suggest that, in noisy samples of the kind used in
the Conference tests, learning cannot remove sample noise injected into the classification
system from the training data because the excess complexity of the network is used to track
the noise in the data. This is not unexpected since the systems have no mechanism for
evaluating “bad” writing except by statistical frequency.
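The dependence of the testing error on the ratio of complexity to sample size can be illustrated numerically. The sketch assumes the relation reads: testing error = training error + 2 x (effective noise variance) x (effective number of parameters) / (training sample size). The function name and all numerical values are hypothetical and are not taken from any Conference system.

```python
def expected_test_error(train_error, noise_variance, effective_params, sample_size):
    """Testing error predicted from training error, effective noise variance,
    effective number of parameters, and training sample size."""
    return train_error + 2.0 * noise_variance * effective_params / sample_size


# Hypothetical values: 0.5% training error, noise variance 0.01.
small_net = expected_test_error(0.005, 0.01, 1000, 10000)
large_net = expected_test_error(0.005, 0.01, 2000, 20000)  # same p_eff / n ratio
denser_net = expected_test_error(0.005, 0.01, 2000, 10000)  # ratio doubled
```

Two systems with the same ratio of effective parameters to sample size have the same predicted gap between training and testing error, while doubling the ratio doubles the gap. This is the sense in which similar accuracy results point to a fixed ratio of complexity to sample size.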

Figure 9: Types of methods used for feature extraction and classification.

6 System Speed


Charles  L.  Wilson 

Figure 10 shows the flow of data through a typical page level OCR system. The details of
the particular system are discussed in [79]. The tests run for this Conference were conducted
on a simplified problem in which the characters were isolated and segmented prior to being
used by the Conference participants, so that the only modules used for Conference testing are
normalization, filtering/feature extraction, recognition, and rejection. The load and store
modules are present in both the full system and the simplified test system. This Conference
did not address field isolation and character segmentation.

Typical timings for a system of the type shown in Figure 10 are given in Table 10. The
dominant times in this table are image loading, field isolation, and character segmentation
times. In the Conference systems, field isolation and character segmentation times were not
required so that the dominant time for the Conference systems is the image loading time. In
the system summaries in Appendices E and F, two rates are listed: the total system time and
the recognition time. In most cases, total system time is much longer than recognition time.
This speed difference increases as recognition time decreases. Most systems have similar load
times but recognition times vary by several orders of magnitude. The minimum recognition
time is less than 1 ms/character. The typical load time is near 100 ms/character. These
two times place distinct bounds on system performance. The recognition rate of the faster
systems is near the present state-of-the-art for recognition performance. The system rate
is near the typical speed that can be achieved loading and decompressing image data on
common present-day desk-top systems.

In order to evaluate the performance bounds of possible systems, some knowledge of both
algorithmic complexity and the importance of the algorithm in the overall system
performance is needed. This cannot be accomplished without breaking the system into separate
components each of which contains only one dominant algorithmic process. The importance
of the scaling of algorithms has been known since the early work on neural networks [69].
The second factor which contributes to Table 10 is the data volume which each module
encounters during operation.

Most theories of numerical algorithm complexity such as those given in [80], [69], and [81]
express complexity results in notation of the form O(n^p) where n is a measure of the size of
a specific type of object such as n weights, n pixels, or n classes and p is a measure of a
specific polynomial complexity. As data flows through the recognition process, n decreases
very rapidly. The characters used for testing in this Conference were scanned at six line pairs
per mm. For a 5 mm square character, this results in an input image having 3600 pixels. The
system outputs were a single class. This reduces n from 3600 to one. In order to separate
the O(n^p) effects from changes in the size of n, the exact proportionality constant for each
type of algorithm is important. An O(n^p) algorithm working on 10 data items may still be fast
if compared to a linear algorithm working on 3600 data items.
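The arithmetic behind the 3600-pixel figure, and the comparison between a polynomial algorithm on few items and a linear algorithm on many, can be made explicit. The sketch assumes that one line pair corresponds to two pixels (one dark, one light); the cubic exponent and the unit proportionality constants are hypothetical choices for illustration.

```python
def pixels_per_side(line_pairs_per_mm, size_mm):
    """Pixels along one side, assuming two pixels per line pair."""
    return int(round(2 * line_pairs_per_mm * size_mm))


def operation_count(n, p, constant=1.0):
    """Operation count for an O(n^p) algorithm with a given constant."""
    return constant * n ** p


side = pixels_per_side(6, 5)          # 12 pixels/mm over 5 mm
image_pixels = side * side            # size of the input image

cubic_on_few = operation_count(10, 3)               # hypothetical p = 3, n = 10
linear_on_image = operation_count(image_pixels, 1)  # linear pass over all pixels
```

With unit constants the cubic algorithm on 10 items costs fewer operations than the linear pass over the image, which is why the size of n at each stage matters as much as the exponent p.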

The systems that were submitted to the Conference for testing used a wide variety of
hardware. These ranged from PCs to a Connection Machine. Several types of special purpose
systems were used. These included VLSI based hardware [10] and three kinds of massively
parallel computer: Connection Machine, Adaptive Solutions, and AMT DAP. Several of these
systems achieved recognition rates over 500 characters/second. At these rates, all of these
systems were limited by image loading requirements. While high rates were achieved using
special hardware, at least one system implemented on a PC platform achieved comparable
speeds. This was possible by programming critical dot product routines as 8-bit calculations
in assembly language. The algorithm used was an MLP with the usual complexity for this
method but the speed achieved was dominated by reducing the basic calculation time.

The speed measurements presented in this report show that high recognition rates can be
achieved either by using powerful hardware or by clever implementation. Algorithmic
complexity cannot be separated from data flow requirements unless each algorithm is separated
from the other system components during testing. High speed systems are limited by the
ability to provide them with image data. None of these variables has been separated in the
data presented here. NIST has measured system performance at the level of detail required
to separate the effects of the various modules [79] but evaluation of such a system is much
more complex than the bounding times given here. The Conference did show that systems
on slow platforms or with slow implementations run at less than a character per second and
that systems implemented on high speed hardware can run at 1000 characters per second if
images can be supplied at this rate.

Figure 10: Data flow in a complete recognition system.


COMPONENT    OVERALL      PER FORM     % OF TOTAL
Load:        18668.328     8.889680    ( 58.54%)
Isolate:      3669.375     1.747321    ( 11.51%)
Segment:      4773.691     2.273186    ( 14.97%)
Normalize:     854.941     0.407115    (  2.68%)
Filter:       3013.547     1.435023    (  9.45%)
Recognize:     250.982     0.119515    (  0.79%)
Reject:         50.900     0.024238    (  0.16%)
Store:         609.079     0.290038    (  1.91%)
Total:       31890.845    15.186117    (100.00%)

Table 10: System times in seconds for 2100 forms on a parallel computer.
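The per-form and percentage columns of Table 10 follow directly from the overall times and the count of 2100 forms. The short sketch below recomputes them; the dictionary and function names are illustrative.

```python
# Overall module times in seconds, as listed in Table 10.
OVERALL_SECONDS = {
    "Load": 18668.328,
    "Isolate": 3669.375,
    "Segment": 4773.691,
    "Normalize": 854.941,
    "Filter": 3013.547,
    "Recognize": 250.982,
    "Reject": 50.900,
    "Store": 609.079,
}
FORMS = 2100


def per_form(seconds, forms=FORMS):
    """Average time per form for one module."""
    return seconds / forms


def share(seconds, total):
    """Percentage of the total time spent in one module."""
    return 100.0 * seconds / total


total_seconds = sum(OVERALL_SECONDS.values())
summary = {
    name: (per_form(seconds), share(seconds, total_seconds))
    for name, seconds in OVERALL_SECONDS.items()
}
```

The recomputed total agrees with the listed total to within rounding of the individual entries, and loading alone accounts for more than half of the time while recognition takes under 1%.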

A Issues Raised by Participants


Christopher  J.  C.  Burges  and  Thomas  P.  Vogl 

This  section  contains  a list  of  issues  that  Conference  participants  raised  during  the  course 
of  their  presentations.  Feedback  from  participants  is  a very  important  part  of  our  effort  to 
make  future  Systems  Conferences  as  effective  and  useful  to  the  community  as  possible.  The 
issues  listed  here  will  be  seriously  considered  in  the  planning  of  the  next  System  Conference. 
Some  of  these  issues  and  possible  problems  anticipated  in  addressing  them  are  described  in 
the  next  appendix.  It  should  be  noted  that  the  following  does  not  represent  a majority  view 
of  participants;  it  is  merely  a list  of  items  that  individual  participants  felt  to  be  important. 

(1) The long range goal of the enterprise should be “Goal Directed Document
Understanding”. Only when the overall goal is kept in mind will we have meaningful end-to-end
performance measures.

(2)  The  next  Systems  Conference  should  involve  recognition  of  isolated  fields:  strings  of 
digits  and  printed,  unsegmented  words,  perhaps  including  cursive  words. 

(3)  Tests  should  be  run  at  NIST,  or  somehow  proctored  by  NIST,  since  whether  we  like  it 
or  not,  results  will  be  used  extensively  for  marketing  purposes. 

(4)  NIST  should  always  provide  a “submission  received”  message  when  materials  are  received 
from  participants.  This  will  prevent  confusion  in  the  event  that,  for  example,  electronic  mail 
is  lost. 

(5)  Participants  should  be  given  a mini-training  set  long  before  anything  else  happens,  so 
that  they  can  get  their  systems  and  software  in  place  and  ready  to  process  large  amounts 
of  training/test  materials  in  limited  time.  (In  the  test  just  passed,  this  process  ate  into 
the  time  available  for  training).  Sufficient  time  should  be  given  to  participants  so  that  all 
problems  regarding  data  formatting  and  data  exchange  can  be  resolved,  so  that  no  time  need 
be  wasted  in  pursuing  these  issues  at  the  following  Conference. 

(6)  The  NIST  Test  Data  1 should  be  split  (by  NIST)  into  training  and  test  subsets,  so 
that  participants  can  compare  the  performance  of  systems  trained  on  a portion  of  the  test 
database. 

(7)  Two  separate  tests  should  be  performed:  one  in  which  the  test  data  is  taken  from  the 
same  distribution  as  the  training  data,  and  one  in  which  it  is  not.  This  should  be  done 
because  in  some  applications,  the  former  may  hold,  while  in  others,  the  latter  may  hold. 

(8)  Participants  should  be  told  what  part  of  the  country  and  world  the  test  samples  are 
from,  so  that  they  might  take  advantage  of  (learned)  “Handwriting  Dialects”. 

(9)  A writer  index  to  databases  should  be  provided,  since  in  a real-world  application  like 
form  reading,  it  is  a good  bet  that  the  same  writer  filled  out  the  whole  form,  and  some 
systems  might  take  advantage  of  this  fact.  Similarly,  writer  implement  identification  should 
be  given. 

(10) For word recognition, lexicons should be made available, even if they are as large as
the English language. Nonsense words constructed from individual characters would not be
a useful test.

(11)  Use  of  contextual  information  (in  addition  to  lexical  information)  should  eventually 
be  tested.  For  example,  in  form  recognition,  there  is  often  valuable  contextual  information 
available,  such  as  how  a particular  writer  prints  the  1 and  the  9 in  a date. 

(12)  NIST,  and  other  users,  should  settle  on  a standardized  resolution  for  images  so  that 
results  of  tests  performed  elsewhere  in  the  community  (outside  of  OCR  Systems  Conferences) 
can  be  more  easily  compared. 

(13) Single character OCR systems should also be tested for their rejection of NON-characters
(junk),  since  that  is  extremely  important  when  segmenting  fields. 

(14)  Systems  should  be  allowed  to  classify  an  image  as  ambiguous.  Systems  should  give 
several  top  choice  candidate  answers  for  a given  image.  Such  information  could  be  used  by 
a contextual-analysis “supersystem”. In addition, a system that does not get the right top
answer,  but  gets  the  right  answer  in  the  top  few,  should  be  given  credit  over  a system  that 
does  not  get  the  correct  answer  anywhere. 

(15) A proposal was made for an error rate metric: error rate = Sum( F(character) x error(character) ),
where F(character) is the frequency of the character in the English language.
Another  proposal  was  to  use  the  integral  of  the  error  rate  as  a function  of  rejection  rate 
instead  of  the  zero-rejection-rate  error  rate. 

(16) Ranking test results by a single measurement was not a good idea; several measurements
should  be  used  to  get  fairer  analyses  (e.g.  raw  recognition  rate,  throughput,  unit  cost, 
punting,  latency  if  any,  flexibility  to  different  applications).  Tests  should  be  done  with  and 
without  both  lexical  and  non-lexical  context,  and  the  scores  for  each  reported. 

(17)  People  who  do  NOT  take  part  in  a Conference  should  still  be  able  to  be  subjected  to 
the  same  kind  of  test  by  NIST,  for  example  by  some  publicly  acknowledged  arrangement  for 
submitting  a request  to  NIST,  getting  test  materials,  and  having  to  return  the  scored,  OCR 
classified  test  materials  within  a fixed,  short  time.  These  “post  conference”  tests  should 
support  the  just-completed  segmented  character  tests  for  some  reasonable  minimum  time 
(say  3 years).  This  would  help  many  who  would  otherwise  not  be  able  to  be  involved,  and 
would  give  a more  accurate  representation  of  the  state  of  the  art  at  any  time. 

(18)  While  it  is  likely  that  in  future  Conferences  all  applicants  will  be  allowed  to  participate 
in  the  testing  and  scoring  aspects  of  the  Conference,  only  those  who  are  willing  to  divulge 
information  about  their  methods  should  be  allowed  to  speak,  or  even  to  attend  the  meeting 
where  other  participants  are  going  to  describe  their  methods. 
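
The scoring proposals in items (14) and (15) can be illustrated with a minimal sketch. It is not the scoring used by NIST; the function names, the sample data, and the choice of the trapezoidal rule for the integral are all assumptions made for illustration.

```python
def top_k_accuracy(candidates, truth, k):
    """Item (14): fraction of images whose true class is among the first k candidates."""
    hits = sum(1 for cands, t in zip(candidates, truth) if t in cands[:k])
    return hits / len(truth)


def weighted_error_rate(freq, err):
    """Item (15): Sum(F(c) * error(c)), with F normalised so that it sums to 1."""
    total = sum(freq.values())
    return sum(freq[c] / total * err[c] for c in freq)


def error_reject_area(reject_rates, error_rates):
    """Item (15): trapezoidal integral of error rate as a function of rejection rate."""
    pts = sorted(zip(reject_rates, error_rates))
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical ranked candidate lists for three images whose true classes are 1, 9, 7.
candidates = [["1", "7", "l"], ["4", "9", "q"], ["2", "3", "8"]]
truth = ["1", "9", "7"]
top1 = top_k_accuracy(candidates, truth, 1)
top3 = top_k_accuracy(candidates, truth, 3)

# Hypothetical character frequencies and per-character error rates.
freq = {"e": 0.6, "z": 0.4}
err = {"e": 0.02, "z": 0.10}
weighted = weighted_error_rate(freq, err)

# Hypothetical (rejection rate, error rate) operating points.
area = error_reject_area([0.0, 0.1, 0.2], [0.05, 0.03, 0.01])
```

A system that places the right answer second, as for the second image here, gains credit under the top-3 measure but not under the top-1 measure, which is the distinction item (14) asks for.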

B NIST Perspective on Perceived Problems


Jon  Geist 

Some NIST staff members, some Conference Committee members, and some participants
pointed out and suggested solutions to problems that occurred at the first Conference. The
problems of most concern to the NIST personnel running the Conference are addressed in
this Appendix. These are not necessarily the problems of most concern to the participants.
The information in this Appendix and in Appendix A is included to help in the planning of
future Conferences.

B.1 Perceived Problem 1

The  plan  to  make  all  results  public  was  inadequately  formulated  and  inadequately  stated  in 
the  Call  for  Participation. 

B.1.1  Proposed  Solutions 

State  this  aspect  of  the  plan  very  clearly  in  future  Calls  for  Participation. 

B.1.2  Discussion 

From  the  earliest  stages  of  planning  for  the  Conference,  it  appeared  that  the  goals  of  the 
Conference could not be met without disclosing the scores obtained by each system. Otherwise,
it would not be possible to ask specific questions about aspects of the performance of
a particular  system.  Therefore,  there  was  at  least  a weak  consensus  among  the  Committee 
to  distribute  all  of  the  scores  for  all  of  the  systems  to  each  participant,  and  to  publish  the 
results  in  a report  that  would  enter  the  public  domain.  During  the  final  preparations  for  the 
Conference  meeting,  the  consensus  grew  in  strength  as  the  problems  with  keeping  the  scores 
confidential  were  brought  clearly  into  focus. 


B.2  Perceived  Problem  2 

The  attempt  to  restrict  the  number  of  participants  was  a mistake. 

B.2.1  Proposed  Solutions 

Open  the  Conferences  to  all  applicants.  If  logistic  considerations  make  it  necessary,  then 
restrict  the  actual  number  of  attendees  at  the  Conference  meeting  rather  than  the  number 
of  participants  in  the  Conference  test. 

B.2.2 Discussion


Much  more  was  learned  by  having  26  participants  than  would  have  been  learned  with  only 
15.  It  is  unlikely  that  the  Committee  would  have  chosen  an  optimum  combination  of  15 
participants  from  the  29  applications  it  received.  By  following  a more  relaxed  schedule,  it 
should  be  possible  to  close  the  application  period  before  a room  is  reserved  for  the  meeting. 
If not, actual attendance at the meeting can be limited based on factors listed in item 18
in  Appendix  A,  on  the  basis  of  the  scores  obtained,  or  by  further  restricting  the  number  of 
nonparticipant  attendees,  if  necessary. 


B.3  Perceived  Problem  3 

Too  much  time  was  required  of  each  participant  to  prepare  a proposal  to  participate  and  to 
respond to requests from the Committee for more information. In addition, the Committee
and NIST staff found it very time consuming to abstract useful information from the
proposals. 

B.3.1  Proposed  Solutions 

A simple  application  form  requesting  all  of  the  information  desired  by  the  Committee  could 
be  submitted  by  the  participants.  This  form  could  be  included  in  the  Call  for  Participation. 

B.3.2 Discussion

The  proposals  did  not  prove  useful  in  choosing  among  the  applicants  for  participation,  and 
the effort the participants expended in preparing their proposals varied greatly. Furthermore,
the  Committee  found  it  necessary  to  solicit  more  information  from  the  participants  in  order 
to  prepare  the  system  summaries  in  Appendix  C. 


B.4  Perceived  Problem  4 

The  test  was  not  proctored. 

B.4.1  Proposed  Solutions 

Possible  solutions  include: 

1) Publicly disclose the participants’ scores in such a way that no one can identify a specific
score  with  a specific  participant. 

2)  Privately  disclose  each  participant’s  scores  only  to  that  participant. 

3)  Set  up  a means  by  which  participants  can  have  their  tests  proctored  if  they  so  desire. 

B.4.2 Discussion


The  idea  behind  the  first  and  second  proposed  solutions  is  that  there  would  be  less  motivation 
for the participants to cheat under these conditions. However, either proposal effectively
prevents open discussion at the meeting, so there would be no point to the meeting. The
Committee learned much at the meeting that would not have been learned otherwise, and
many participants claimed that they learned a lot at the meeting. It seems that undisclosed
or secretly disclosed scores without meetings, and meetings about openly disclosed scores,
are the only practical alternatives. Since one of the purposes of a systems conference is to
stimulate  improvements  in  the  state  of  the  art,  the  first  two  proposed  solutions  do  not  seem 
workable. 

Also,  secretly  disclosed  scores  do  not  really  remove  the  motivation  to  cheat,  but  only  modify 
it.  With  unproctored  tests  and  openly  disclosed  scores,  the  participants  might  be  tempted  to 
supplement  their  results  with  human  classification  in  order  to  get  a better  score,  so  that  they 
could  advertise  that  score  either  to  their  sponsor  to  continue  funding  or  to  potential  customers 
to  encourage  sales.  With  proctored  tests  and  secretly  disclosed  scores,  the  participants  might 
be  tempted  to  lie  about  their  scores  to  their  sponsors  or  their  potential  customers  for  exactly 
the  same  reasons.  This  would  be  possible  because  there  would  be  no  independent  way  for 
anyone  to  verify  that  any  particular  participant  actually  received  the  score  claimed  unless 
the  Committee  were  brought  into  all  such  discussions  in  a police  role.  This  does  not  seem 
practical. 

The  idea  behind  the  third  proposed  solution  is  that  those  participants  who  chose  to  enter  the 
proctored  section  of  the  test  would  be  protected  from  comparison  with  those  who  did  not  by 
the  fact  that  the  latter  were  apparently  afraid  to  be  proctored.  Various  ways  that  proctored 
tests  could  be  conducted  without  requiring  an  undue  amount  of  proctor  time  were  proposed. 
For  instance,  the  tests  could  be  run  on  one  or  a limited  number  of  different  platforms,  and 
they  could  be  submitted  as  executable  code  with  a (yet  to  be  specified)  standard  interface 
to  a (yet  to  be  specified)  central  location  where  the  test  would  be  conducted.  It  remains 
an open question how practical this would actually be, but the development of a standard
interface  for  proctored  tests  might  be  a useful  activity. 


B.5  Perceived  Problem  5 

Some  participants  were  not  as  open  as  they  should  have  been  for  a Conference  of  this  nature. 

B.5.1  Proposed  Solutions 

Possible  solutions  include: 

1)  Request  all  information  required  of  the  participants  in  the  application  form,  and  reject 
any  participants  who  do  not  provide  it. 

2) Make open discussion a prerequisite for attendance at the Conference meeting, but not
for  participation  in  the  test,  as  discussed  in  item  18  of  Appendix  A. 

B.5.2 Discussion


These  solutions  might  reduce  participation,  particularly  the  first,  which  would  reduce  the 
usefulness  of  the  Conference  as  discussed  above  under  Perceived  Problem  2.  The  Committee 
appreciates  the  openness  that  many  participants  showed,  and  hopes  that  their  example  will 
help  other  participants  to  be  more  open  in  the  future. 

B.6  Perceived  Problem  6 

E-mail  did  not  prove  suitable  for  returning  test  results  to  NIST. 

B.6.1  Proposed  Solutions 

Possible  solutions  include: 

1)  Require  all  submissions  on  disk  or  tape. 

2)  Set  up  an  anonymous  FTP  on  a computer  at  NIST  to  receive  the  test  results. 

3)  Set  up  participant  accounts  on  a computer  at  NIST  to  receive  the  test  results. 

B.6.2 Discussion

It  is  most  convenient  for  NIST  IRG  personnel  to  receive  test  results  directly  on  a computer 
at  NIST  rather  than  to  have  to  read  a disk  or  tape.  The  people  responsible  for  choosing 
E-mail did not know that DARPANET and BITNET E-mail network nodes truncate E-mail
messages to 100k or 300k bytes and cannot handle the volume of messages that can
be  encountered  from  a number  of  different  participants  all  submitting  their  returns  at  the 
same  time  through  various  network  nodes.  To  use  E-mail,  some  participants  had  to  split 
tarred  and  uuencoded  files  into  190  separate  files  for  submission.  The  E-mail  spooler  on  the 
IRG  network  node  cannot  handle  this  many  messages  at  one  time.  To  solve  this  problem,  a 
software  switch  was  written  to  intercept  Conference  returns,  and  to  redirect  them  to  a large 
buffer.  Unfortunately,  the  NIST  IRG  computers  went  down  over  the  weekend  before  the  last 
day  to  return  the  results.  When  the  computers  were  restored  many  of  the  E-mailed  returns 
were  waiting  at  various  E-mail  nodes,  and  they  all  tried  to  enter  the  NIST  node  at  the  same 
time. This caused the buffer to crash. An anonymous FTP is clearly the best solution.


B.7  Perceived  Problem  7 

Some submissions and some corrections of errors in the format or content of CON and RJX
files (described in Appendices C and D), that could not be easily carried out at NIST, were
accepted and scored after April 27, which was the cutoff date for submission of test results.

B.7.1 Proposed Solutions

Possible  solutions  include: 

1)  Don’t  enforce  any  time  limits. 

2)  Strictly  enforce  a time  limit. 

3)  Provide  all  participants  with  C source  code  for  a package  that  checks  the  structure  of 
the results before they are submitted to NIST, and only score those submissions that are
received  on  time  at  NIST  and  that  pass  the  same  check  at  NIST. 

4)  List  all  of  the  results  that  were  obtained  on  time  and  in  the  correct  format  in  one  section, 
and all of the other results in a separate section to distinguish them in this respect.

B.7.2  Discussion 

The  third  proposed  solution  seems  the  best  compromise  for  addressing  this  problem  for 
future  Conferences.  The  fourth  proposed  solution  was  adopted  for  summarizing  the  results 
of  this  Conference.  That  section  also  contains  a few  results  submitted  after  the  Conference 
meeting  to  address  specific  questions  brought  up  during  the  meeting. 

The  time  limit  was  imposed  mainly  to  assure  that  most  of  the  results  would  be  received  in 
time  for  scoring  at  NIST  before  the  Conference  meeting.  If  the  time  limit  is  not  enforced, 
then  the  participants  will  not  make  the  effort  to  adhere  to  it,  and  it  will  not  serve  its  main 
purpose.  On  the  other  hand,  a number  of  useful  submissions  would  have  been  rejected 
had  the  time  limit  been  strictly  enforced  due  to  the  problems  that  the  participants  had  in 
conforming  to  the  data  formats  specified  for  the  classification  results.  Since  this  was  the  first 
Conference,  and  since  the  E-mail  procedure  for  returning  the  results  was  fraught  with  its  own 
problems,  it  was  decided  to  request  resubmissions  of  lost  or  incomplete  results  submitted  by 
E-mail.  This  led  to  requests  for  resubmission  in  correct  format  of  CON  and  RJX  files  that 
did  not  conform  to  the  specified  formats.  This,  in  turn,  led  to  a toleration  for  results  that 
were submitted late, but before all of the CON and RJX files were received in correct format.

C The Call for Participation


CALL  FOR  PARTICIPATION 

FIRST  CENSUS  OPTICAL  CHARACTER  RECOGNITION  SYSTEMS  CONFERENCE 


February 1992 - May 1992


Sponsored  by: 

US  Bureau  of  the  Census 


Conducted  by: 

National  Institute  of  Standards  and  Technology 

(NIST) 

1.  Background  of  the  CALL  FOR  PARTICIPATION: 

The  Bureau  of  the  Census  has  requested  the  National  Institute 
of  Standards  and  Technology  (NIST)  to  run  a Systems  Conference 
on Optical Character Recognition (OCR) of handprinted characters.
One goal of the Systems Conference is to give the Bureau
a sense  of  the  current  state-of-the-art  in  OCR  of  hand  printed 
characters  and  the  directions  of  near  term  R&D.  Another  goal 
is  to  provide  a forum  through  which  participants  can  influence 
1)  the  creation  of  large  data  bases  of  hand  printed  characters 
for uniform testing of OCR systems, 2) the development of uniform
methods for scoring tests based on such databases, and 3)
the development of standards for evaluating the performance of
OCR systems.

A registration  fee  of  about  $50  per  person  will  be  charged  to 
cover  lunches  and  coffees.  All  aspects  of  participation  in 
this  Systems  Conference  will  be  carried  out  at  no  cost  to  the 
Government.  No  contracts  or  grants  will  be  awarded  in 
connection  with,  or  as  a result  of  this  Systems  Conference. 

The  Conference  will  be  run  through  the  First  Census  OCR 
Systems  Conference  Committee  consisting  of  the  following 
members:

Jon  Geist,  NIST,  chairman 
Charles Wilson, NIST
Bob  Hammond,  Census  Bureau 
Robert  Creecy,  Census  Bureau 
Tom  Vogl,  ERIM 
Christopher  Burges,  ATT 
Jonathan  Hull,  U.  of  Buffalo 
Norman  Larsen,  Census  Bureau 

2.  Activities  of  the  Systems  Conference: 

2.1  The  Committee  will  review  with  respect  to  the  criteria  stated 
in  Section  6 below  all  applications  for  participation  that 
are  received  at  NIST  before  Close  of  Business  (COB)  on  March 
4,  1992.  The  Committee  reserves  the  right  to  review 
applications  received  after  this  date. 

2.2  The  Committee  will  divide  the  applications  into  two 
categories  with  respect  to  the  criteria  stated  in  Section  6, 
qualified  and  unqualified. 

2.3  If  there  are  more  than  about  fifteen  qualified  applications, 
the  Committee  will  rank  them  according  to  the  criteria  stated 
in  Section  6,  and  will  select  about  fifteen  applications  for 
participation  in  the  Conference.  Otherwise,  all  qualified 
applications  will  be  selected  for  participation. 

2.4  NIST  will  inform  the  participants  of  their  selection  and  will 
send  them  training  materials  on  behalf  of  the  Committee 
before  COB  March  23,  1992. 

2.5  NIST  will  send  test  materials  to  the  participants  on  behalf 
of  the  Committee  before  COB  April  13,  1992. 

2.6  The  participants  will  return  the  completed  test  materials  to 
NIST  before  COB  April  27,  1992. 

2.7 NIST will score the returned test materials.

2.8  The  participants  and  the  Committee  will  attend  a meeting  on 
May  27  and  28,  1992.  At  this  meeting 

1)  NIST  will  describe  all  participants’  test  results. 

2)  each  participant  will  present  a 15  minute  talk  outlining 
the OCR approach used in completing the test, and any
other  information  that  they  deem  pertinent,  and 

3)  the  Committee  and  the  participants  will  attempt  to  reach 
a consensus  about  what  sort  of  test  materials  should  be 
provided  for  the  Second  Census  OCR  Systems  Conference, 
and what other issues should be addressed to make the
second  Conference  more  useful  to  the  participants. 

3.  Specification  of  Formats: 

3.1  Training  and  Test  materials  supplied  by  NIST: 

Both  the  training  and  test  materials  will  consist  of  digital 
images  of  segmented  numbers,  upper  case  letters,  and  lower 
case  letters  on  an  ISO-9660  formatted  CD  ROM  disc.  For  the 
training  and  test  materials,  the  numbers,  the  upper  case 
letters,  and  the  lower  case  letters  will  be  in  separate 
files. However, as many as 30% of the letters in the lower
case  training  files  will  actually  be  little  upper  case 
letters  that  were  printed  when  lower  case  letters  were 
requested.  The  participants  are  requested  to  return  their 
test results by E-Mail, but they may also return them on 8mm
magnetic tape, or on IBM PC compatible 5.25 inch floppy
disks.

The  format  for  the  image  data  for  both  the  training  and  test 
materials  will  be  in  the  Multiple  Image  Set  (MIS)  format, 
and  the  format  for  the  classification  data  for  the  training 
materials  will  be  the  Multiple  Feature  Set  (MFS)  format. 
These  formats  are  used  by  the  NIST  Image  Recognition  Group 
for  Standard  Reference  Databases.  More  details  of  the  MIS 
and  MFS  file  formats,  and  the  test  result  formats  are  given 
in  the  Appendix  to  this  document. 

3.2  Test  Results  supplied  by  Participants: 

Participants  will  be  required  to  return  their  test  results 
in  a Classification  file  (HYP)  and  either  a Confidence  file 
(CON)  or  a set  of  Rejection  (RJX)  file,  if  confidence  levels 
are  not  readily  available.  All  of  theses  files,  HYP  and  CON 
or  HYP  and  multiple  RJX,  must  be  in  the  MFS  format.  More 
details  about  the  specifications  for  these  files  are  given 
in  the  Appendix  along  with  the  specifications  of  the  MIS  and 
MFS file formats. Participants will also be required to
report  the  elapsed  time  for  the  OCR  process  and  minimal 
specifications  of  the  system  used  to  obtain  the  results.  Up 
to  five  different  sets  of  test  results  will  be  accepted  from 
each  participant,  but  the  participants  must  prioritize  the 
results  according  to  the  format  described  in  the  Appendix. 

4.  Application  Format: 

Applications  to  participate  in  the  First  Census  OCR  Systems 
Conference  should  be  no  longer  than  3 pages  of  text,  and 
should  conform  to  the  following  format: 

4.1  The  first  section  should  briefly  describe  the  proposing 
organization.  This  section  should  identify  the  person  who 
will  be  the  point  of  contact  for  the  Systems  Conference, 
including  the  mailing  address,  phone  number,  FAX  number  and 
E-Mail  address,  as  appropriate,  and  up  to  two  other 
attendees.

4.2  The  second  section  of  the  application  should  state  that  the 
proposing  organization  agrees  to  participate  by  following 
instructions  for  the  training,  testing,  and  meeting  phases  of 
the  Conference,  provided  that  NIST  supplies  the  materials 
before  the  dates  stated  in  Section  2 of  this  CALL,  as 
summarized  below: 

03/04 -- deadline for receipt of application at NIST
03/23 -- deadline for receipt of training material from NIST
04/13 -- deadline for receipt of testing material from NIST
04/27 -- deadline for return of completed test to NIST

4.3  The  third  section  should  concisely  describe  the  state  of  the 
art  in  OCR  of  handprinted  characters  in  the  proposing 
organization  by  reporting  at  least  one  data  point  for  the 
error  rate  and  the  rejection  rate  for  some  subset  of  NIST 
Special  Database  1 or  some  other  database  of  handprinted 
characters.  The  nature  of  the  database  used  and  exactly 
which  OCR  functions  were  performed  automatically  for  the 
results  presented  should  also  be  indicated. 

4.4  The  fourth  section  should  concisely  outline  the  approach  to 
OCR  used  for  the  results  reported  in  the  third  Section. 

5. Submission of Applications:


Applications  should  be  submitted  to: 

Jon  Geist 
B316/225 
ASD/NCSL/NIST 
Gaithersburg,  MD  20899 

Applications may be submitted by regular mail,
express mail, courier service, FAX to (301) 948-4081, or
E-Mail to geist@sed.eeel.nist.gov or geist@magi.ncsl.nist.gov.

6.  Rating  Criteria: 

Applications  not  meeting  the  requirements  stated  in  Section  4 
of  this  CALL  may  be  eliminated  from  further  consideration  at 
the  discretion  of  the  Committee.  The  remaining  Applications 
will be divided into qualified and unqualified categories
on the basis of sections 4.3 and 4.4. These sections should
demonstrate in a concise manner both a thorough understanding
of the basic ideas of OCR of handwritten characters, and a
state-of-the-art  competence  in  this  area. 

If  more  than  15  qualified  proposals  are  received,  they  will  be 
divided  into  categories  based  on  the  similarity  of  the  OCR 
techniques  described,  and  the  applications  in  each  category 
will  be  ranked  according  to  the  performance  claimed  in  the 
third  section  of  each  application.  The  fifteen  applications 
will  be  selected  by  choosing  the  top  ranked  application  from 
each  category,  then  the  second  ranked,  and  so  forth,  until 
about  15  applications  are  chosen.  The  Committee  reserves  the 
right  to  reject  all  but  one  application  from  a single 
organization.
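
The selection rule in Section 6 amounts to a round-robin draw across technique categories. The following is a minimal sketch only: the Call does not say how "performance claimed" is measured or how ties are broken, so ranking by claimed error rate (lower is better), visiting categories in alphabetical order, and every name in the example are assumptions.

```python
def select_applications(categories, limit=15):
    """Pick the top-ranked application from each category, then the second-ranked,
    and so on, until `limit` applications are chosen or none remain.

    categories maps a category name to a list of (applicant, claimed_error_rate).
    """
    # Rank within each category; a lower claimed error rate ranks higher (assumption).
    ranked = {name: sorted(apps, key=lambda a: a[1]) for name, apps in categories.items()}
    selected = []
    rank = 0
    while len(selected) < limit and any(rank < len(r) for r in ranked.values()):
        for name in sorted(ranked):
            if rank < len(ranked[name]) and len(selected) < limit:
                selected.append(ranked[name][rank][0])
        rank += 1
    return selected


# Hypothetical applicants grouped by hypothetical technique categories.
categories = {
    "neural": [("A", 0.05), ("B", 0.03)],
    "template": [("C", 0.08)],
}
first_two = select_applications(categories, limit=2)
everyone = select_applications(categories, limit=15)
```

With a limit of two, the draw takes the best applicant from each category (B, then C) and leaves out A even though A claims a lower error rate than C, which is the diversity-over-raw-score effect the rule is designed to produce.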

D Instructions to Participants

INDEX 

1.  INTRODUCTION 

2.  DIRECTORY  FORMAT 

3.  RETURN  FILE  FORMATS 

3.1  ASCII  STRING  (ASR)  AND  LINE  (ALR)  REPRESENTATIONS 

3.2  MFS  FILE  FORMAT 

3.3  HYP  FILE  FORMAT 

3.4  RJX  FILE  FORMAT 

3.5  CON  FILE  FORMAT 

4.  TEST  RESULTS  FORMAT 

4.1  E-MAIL 

4.2  EXABYTE  UNIX  8MM  MAGNETIC  MEDIA 

4.3  IBM  PC  1.2  MEGABYTE  5.25"  FLOPPY  DISK 


1.  INTRODUCTION 

This  report  contains  information  about  how  to  return  the  OCR 
test results obtained for the MIS files in subdirectory TEST1
to  NIST  for  scoring. 

For  the  purposes  of  this  test  you  have  been  assigned  the  site 
name:

2. DIRECTORY FORMAT


The main directory on the CD-ROM called TEST DISK 1 has three
subdirectories, DOS, DOC, and TEST1. DOS contains files that
are  needed  only  if  the  test  results  will  be  returned  on  an  IBM 
PC  compatible,  1.2  Mbyte,  5.25"  floppy  disk. 

The actual test files are stored in subdirectory TEST1, which
is organized as follows:


TEST1
  |
  +-- DIGIT
  |     D_001.MIS  D_002.MIS  ...  D_293.MIS
  |
  +-- UPPER
  |     U_001.MIS  U_002.MIS  ...  U_059.MIS
  |
  +-- LOWER
        L_001.MIS  L_002.MIS  ...  L_059.MIS


Each  file  except  the  last  in  each  of  the  subdirectories  DIGIT, 
UPPER,  and  LOWER  has  200  images  in  it,  while  the  last  file  has 
less  than  200  images. 

All of the information on the test file formats and programs
to  read  these  file  formats  was  included  on  and/or  with  NIST 
Special  Database  3,  which  was  sent  to  you  as  the  training 
materials for this test.


3.  RETURN  FILE  FORMATS 

This  section  explains  the  various  file  formats  for  use  in 
returning  your  OCR  classifications  of  the  MIS  file  images  in 
subdirectory TEST1 to NIST for scoring.

3.1  ASCII  STRING  (ASR)  AND  LINE  (ALR)  REPRESENTATIONS 

An  ASCII  String  Representation  (ASR)  is  a buffer  of  variable 
length containing any number of printable ASCII characters,
where  the  printable  ASCII  characters  include  all  characters 
in  the  hexadecimal  range  20  to  7E. 

An  ASCII  Line  Representation  (ALR)  is  an  ASR  terminated  by 
the ASCII LF character, hexadecimal 0A. This means that the
ASCII CR character 0D cannot occur anywhere in an ALR, or in
place of, or in combination with the ASCII LF character 0A at
the end of the ALR.
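
The two definitions above translate directly into byte-level checks. This is a minimal sketch for illustration; the function names `is_asr` and `is_alr` are not part of the NIST software.

```python
def is_asr(data: bytes) -> bool:
    """True if every byte is a printable ASCII character (hex 20 through 7E)."""
    return all(0x20 <= b <= 0x7E for b in data)


def is_alr(data: bytes) -> bool:
    """True if data is an ASR followed by exactly one LF (hex 0A) terminator."""
    return data.endswith(b"\n") and is_asr(data[:-1])


unix_line_ok = is_alr(b"47\n")      # LF-terminated line
dos_line_ok = is_alr(b"47\r\n")     # CR LF line ending, as written by DOS tools
unterminated_ok = is_alr(b"47")     # missing terminator
```

The second case matters in practice: a file written with CR LF line endings fails the check because hex 0D lies outside the printable range.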

3.2  MFS  FILE  FORMAT 

A Multiple Feature Set (MFS) file is a file of ALRs. Each
MFS  file  is  associated  with  a unique  MIS  file.  The  first 
line  of  the  MFS  file  contains  the  ASR  of  a decimal  number, 
which  is  the  number  of  lines  in  the  file  minus  one,  and  which 
is  also  the  number  of  images  in  the  associated  MIS  file.  No 
ASCII  SPACE  characters  are  allowed  in  the  ASR  for  the  first 
line.  Each  line  following  the  first  line  of  an  MFS  file  is 
an  ALR  containing  information  about  the  corresponding  image 
in  the  associated  MIS  file. 
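
A structure check for the MFS layout described above might look like the following. It is a sketch under the stated rules only, not the checking package distributed by NIST, and the name `parse_mfs` is hypothetical.

```python
def parse_mfs(data: bytes) -> list:
    """Validate an MFS file image and return its per-image entries as strings."""
    if not data.endswith(b"\n"):
        raise ValueError("last line is not LF-terminated")
    lines = data[:-1].split(b"\n")
    for line in lines:
        # Every line must be an ASR: printable ASCII only, so no CR (hex 0D).
        if not all(0x20 <= b <= 0x7E for b in line):
            raise ValueError("non-printable byte in line")
    header = lines[0]
    # Digits only: this also rules out ASCII SPACE characters in the first line.
    if not header.isdigit():
        raise ValueError("first line must be a decimal count with no spaces")
    if int(header) != len(lines) - 1:
        raise ValueError("count does not match the number of entries")
    return [line.decode("ascii") for line in lines[1:]]


entries = parse_mfs(b"5\n47\n72\n4C\n53\n77\n")
```

The check that the count equals the number of lines minus one catches truncated files, which is one way a damaged transfer could be detected before scoring.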


3.3  HYP  FILE  FORMAT 

A Hypothesis  (HYP)  file  is  a file  in  the  MFS  file  format. 
Each  line  following  the  first  line  contains  an  ASR  of  the 
correct  class  assigned  to  the  corresponding  image  in  the 
associated  MIS  file.  The  ASR  in  each  line  consists  of  two 
ASCII  characters.  These  are  the  ASCII  characters  that 
represent  the  hexadecimal  number  that  represents  the  ASCII 
character  of  the  class.  No  space  characters  are  allowed  on 
any line of this type of file.

The  name  of  a HYP  file  must  be  the  same  as  the  name  of  the 
associated  MIS  file,  except  that  the  extension  must  be  .HYP. 
For example, consider an MIS file called ALPHAS.MIS that
contains images of the five characters G, r, L, S, and w. An
ASCII dump (that recognized the convention that 0A is the end
of line marker) of the associated file ALPHAS.HYP would look
like:

5
47
72
4C
53
77

Similarly, a HEX dump of the same HYP file would look like:

35 0A 34 37 0A 37 32 0A 34 63 0A 35 33 0A 37 37 0A

(An upper case "C" (hex 43 instead of hex 63) would also be OK.)

You  will  return  one  HYP  file  for  each  MIS  file  in  the 
subdirectory TEST1.
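
The ALPHAS example can be reproduced with a short sketch. The helper names `hyp_bytes` and `hyp_classes` are hypothetical; the encoding itself (two hexadecimal digits per class, LF line endings, image count on the first line) follows the description above.

```python
def hyp_bytes(classes: str) -> bytes:
    """Encode one hypothesized class per image as the contents of a HYP file."""
    lines = [str(len(classes))] + [format(ord(c), "02X") for c in classes]
    return ("\n".join(lines) + "\n").encode("ascii")


def hyp_classes(data: bytes) -> str:
    """Decode HYP file contents back to characters; hex digits may be either case."""
    entries = data.decode("ascii").split("\n")[1:-1]
    return "".join(chr(int(e, 16)) for e in entries)


encoded = hyp_bytes("GrLSw")
decoded_upper = hyp_classes(b"5\n47\n72\n4C\n53\n77\n")
decoded_lower = hyp_classes(b"5\n47\n72\n4c\n53\n77\n")
```

Both spellings of the third entry, 4C and 4c, decode to the letter L, matching the note that either case of the hex digit is acceptable.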


3.4  RJX  FILE  FORMAT 

A Rejection  (RJX)  file  is  a file  in  the  MFS  file  format  in 
which  the  ASR  on  each  line  following  the  first  line  is  an 
ASCII  0 or  an  ASCII  1.  A 1 indicates  that  the  classification 
should  be  scored  as  a Reject  rather  than  as  a Correct  or  an 
Incorrect.  A 0 indicates  that  the  classification  should  be 
scored  Correct  if  identical  with  the  correct  classification 
and  scored  Incorrect  otherwise. 

The  name  of  an  RJX  file  must  be  the  same  as  the  name  of  the 
associated  MIS  file  except  that  it  must  end  in  one  of  the  ten 
extensions .RJ0, .RJ1, ..., .RJ8, .RJ9. Again as an example,
consider the same MIS file used for the last example. An
ASCII dump of one of the associated RJX files, ALPHAS.RJ3,
for  instance,  might  look  like: 

5
0
1
1
0
1

Similarly, a HEX dump of the same RJX file would look like:

35 0A 30 0A 31 0A 31 0A 30 0A 31 0A

You  may  use  the  RJX  file  format  to  return  information  on  the 
reliability  of  the  hypothetical  classifications  obtained  from 
your  OCR  system.  This  format  is  useful  if  your  system  does 
not  provide  confidence  levels  or  activations.  Also,  if  your 
system has an accept/reject criterion that is more complex
than setting a threshold on the highest confidence level or
activation, this is the preferred format. If you choose to
use this format, you should provide up to ten RJX files for
each MIS file, and should try to include some rejection rates
in the range from 5% to 50%.
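
The way an RJX file changes the scoring of a HYP file can be sketched as follows. This is an illustration of the rule stated above, not the NIST scoring program; the function name and the reference classification used in the example are assumptions.

```python
def score(hyp, truth, reject_flags):
    """Return (correct, incorrect, rejected) counts for one MIS file.

    hyp and truth hold one class per image; reject_flags holds the RJX
    entries, "1" for Reject and "0" for accept.
    """
    correct = incorrect = rejected = 0
    for h, t, r in zip(hyp, truth, reject_flags):
        if r == "1":
            rejected += 1      # scored as a Reject whatever the hypothesis was
        elif h == t:
            correct += 1
        else:
            incorrect += 1
    return correct, incorrect, rejected


# RJX entries from the ALPHAS.RJ3 example; the reference classes are hypothetical.
flags = ["0", "1", "1", "0", "1"]
counts = score("GrLSw", "GrLZw", flags)
reject_rate = counts[2] / len(flags)
```

Here three of five images are rejected, so the rejection rate is 60%. Supplying several RJX files with different rejection rates gives several points on the error versus rejection curve for the same HYP file.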


3.5  CON  FILE  FORMAT 

A Confidence  (CON)  file  is  a file  in  the  MFS  file  format  in 
which  the  ASR  on  each  line  after  the  first  line  gives  the 
decimal  representation  of  the  confidence  level  (or 
activation)  assigned  to  the  classification  on  the 
corresponding  line  of  the  HYP  file  that  is  associated  with 
the  same  MIS  file.  The  confidence  level  must  be  a number 
ranging  from  0.0  through  1.0.  The  number  of  digits  to  the 
right  of  the  decimal  point  must  be  less  than  17. 

The  name  of  a CON  file  must  be  the  same  as  the  name  of  the 
associated  MIS  file  except  that  the  extension  must  be  .CON. 
For example, consider the same MIS file used for the last
example. An ASCII dump of the associated file ALPHAS.CON
might look like:

5
0.375
.9
.7
.4
.8

Similarly, a HEX dump of the same CON file would look like:

35 0A 30 2E 33 37 35 0A 2E 39 0A 2E 37 0A 2E 34 0A 2E 38 0A

(Leading zeros are optional.)

You may use the CON file format to return the confidence
levels assigned by your OCR system to the hypothetical
classifications obtained from your OCR system, provided
that such information is available, and provided that your
system makes its accept/reject decisions by comparing the
contents of these files with a user-specified threshold.
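
The threshold rule described above can be sketched in Python as follows. The parser assumes, as in the example dump, that the first line is the entry count and that each later line is a decimal number between 0.0 and 1.0. The choice to reject when the confidence is strictly below the threshold is an assumption of this sketch, and the function names are hypothetical.

```python
def parse_con(data):
    """Parse CON-style bytes into a list of confidence levels."""
    lines = data.decode("ascii").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty string after the final line feed
    count = int(lines[0])
    values = [float(text) for text in lines[1:]]  # ".9" parses as 0.9
    if len(values) != count:
        raise ValueError("entry count does not match the first line")
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("confidence levels must lie in [0.0, 1.0]")
    return values


def reject_flags(confidences, threshold):
    """Return 1 (reject) where confidence < threshold, else 0 (assumed rule)."""
    return [1 if c < threshold else 0 for c in confidences]


con = bytes.fromhex(
    "35 0A 30 2E 33 37 35 0A 2E 39 0A 2E 37 0A 2E 34 0A 2E 38 0A"
)
levels = parse_con(con)
print(levels)                     # [0.375, 0.9, 0.7, 0.4, 0.8]
print(reject_flags(levels, 0.5))  # [1, 0, 0, 1, 0]
```

Sweeping the threshold from 0.0 to 1.0 traces out the error rate as a function of rejection rate from a single CON file, which is why no separate RJX files are needed in this case.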


4.  TEST  RESULTS  FORMAT 

This section describes how the test results are to be returned
to NIST. Three media are supported. The preferred medium is
E-Mail, the next choice is an Exabyte UNIX 8mm Magnetic Tape,
and the last choice is an IBM PC compatible 5.25" floppy disk.

No  matter  which  format  is  used,  the  same  directory  tree 
structure  will  be  used  for  organizing  the  test  results.  The 
tree  will  look  as  follows: 

<SITE_NAME>
    |
  test1
    |
    +-- digit
    |
    +-- lower
    |
    +-- upper
          |
          u.000.hyp  u.000.con  u.000.rj0  <...


where <SITE_NAME> is the name assigned to your site in Section
1 above, if you are only reporting one set of results. If you
are reporting more than one set of results (for instance,
results for different systems, or results for the same system
that are based on different sets of training materials such as
your proprietary training materials and NIST Special Database
3), then <SITE_NAME> is obtained from the name assigned your
site by adding an underscore followed by a single digit 1
through 5 to your assigned site name. For instance, if your
assigned site name is XYZ, and you are reporting only one set
of results, then <SITE_NAME> = XYZ, but if you are reporting
two sets of results, then <SITE_NAME> = XYZ_1 for the first
set, and <SITE_NAME> = XYZ_2 for the second set. Note that
NIST will assign a higher priority to XYZ_1 than to XYZ_2 for
scoring and reporting purposes. Each separate set of results
having a different <SITE_NAME> must be sent on a separate
floppy disk, on a separate 8mm tape, or in a separate E-mail
message. Your 8mm tapes (and floppy disks, if you want them
back) will be returned at the Conference.

The  next  three  sections  describe  how  to  send  the  test  results 
to  NIST.  Of  the  following  three  options,  E-Mail  is  preferred, 
8mm  tape  is  next,  and  5.25"  floppy  disks  can  be  used  as  a last 
resort.  The  results  of  each  test  are  to  be  sent  only  once 
unless  NIST  requests  that  they  be  resent. 


4.1  E-MAIL 

This  is  the  preferred  way  of  getting  the  test  results  to 
NIST. 

With  a UNIX  system,  the  directory  tree  described  above  will 
be  turned  into  one  file  using  a tar  utility  and  a uuencode 
utility.  The  command  line  to  turn  the  directory  tree  into 
one  file  is  as  follows  and  should  be  run  from  the  directory 
above  <SITE_NAME>: 

    tar -cvf <SITE_NAME>.tar ./<SITE_NAME>/test1

This command will generate a file, <SITE_NAME>.tar,
containing everything in the directory test1.

The file must be uuencoded to send it by E-Mail. To uuencode
the tar file use the following command:

    uuencode <SITE_NAME>.tar <SITE_NAME>.uu > <SITE_NAME>.uu

This will create the file <SITE_NAME>.uu which can be E-mailed
to NIST. The mail command may vary from machine to machine.
Please be sure to include the subject line

    "test1 results from <SITE_NAME>"

and to send the results to urt@magi.ncsl.nist.gov. On a Unix
machine this can be done as follows:

    mail -s "test1 results from <SITE_NAME>" \
         urt@magi.ncsl.nist.gov < <SITE_NAME>.uu

Refer  to  Section  4.3  to  see  how  to  use  a MS-DOS  based  tar 
utility,  which  will  be  provided  with  the  test  materials,  to 
prepare  the  directory  tree  for  E-Mailing.  Again,  the  mail 
command  will  vary  from  machine  to  machine. 
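
For sites without the tar and uuencode utilities at hand, the same two packaging steps can be approximated in Python. This is only a sketch under stated assumptions: it builds the archive in memory rather than from a directory on disk, the file name and contents in the example are hypothetical, and the `begin`/`end` framing follows the traditional uuencode layout with a backtick line marking the end of data. It is not a replacement for the utilities NIST tested.

```python
import binascii
import io
import tarfile


def tar_bytes(files):
    """Build an uncompressed tar archive in memory from {path: bytes} pairs."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for path, content in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def uuencode(data, name, mode=0o644):
    """Wrap data in uuencode framing, 45 input bytes per encoded line."""
    lines = [f"begin {mode:o} {name}"]
    for start in range(0, len(data), 45):
        chunk = data[start:start + 45]
        encoded = binascii.b2a_uu(chunk, backtick=True)
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines.append("`")  # zero-length line marks the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"


def uudecode(text):
    """Inverse of uuencode(), used here only to check the round trip."""
    body = text.splitlines()[1:-1]  # drop the "begin ..." and "end" lines
    return b"".join(binascii.a2b_uu(line) for line in body)


# Hypothetical site name and file; real submissions follow the tree above.
archive = tar_bytes({"XYZ/test1/digit/sample.hyp": b"5\n0\n1\n2\n3\n4\n"})
message = uuencode(archive, "XYZ.tar")
assert uudecode(message) == archive
```

The resulting text contains only printable ASCII characters, which is the reason for uuencoding the binary tar file before mailing it.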


4.2  EXABYTE  UNIX  8MM  MAGNETIC  MEDIA 

This option requires an Exabyte 8mm tape drive having no
compression hardware, and a machine running UNIX.

The directory tree above will be turned into one file using a
tar utility. The command line to turn the directory tree
into one file is as follows and should be run from the
directory above <SITE_NAME>:

    tar -cvf <TAPE_DEVICE> ./<SITE_NAME>/test1

where <TAPE_DEVICE> is the 8mm tape drive.

This command will generate a tar tape containing everything
in the directory test1.

Send the 8mm tape by Express Mail or Federal Express to:

R. Allen Wilkinson
Room A216 TECH
NIST/NCSL/ASD/IRG
Gaithersburg, MD 20899


4.3  IBM  PC  1.2  MEGABYTE  5.25"  FLOPPY  DISK 

This disk should be readable on any machine running MS-DOS.

The directory tree above will be turned into one file using a
public domain MS-DOS tar utility, which was tested and shown
to produce directory structures that can be handled at NIST.
This public domain utility can be found in the DOS subdirectory
under the TEST1 directory on the TEST DATA 1 CD-ROM.

The command line to turn the directory tree into one file is
as follows and should be run from the directory above
<SITE_NAME>:

    tar -cvf <SITE_NAME>.tar .\<SITE_NAME>\test1

This command will generate a file, <SITE_NAME>.tar,
containing everything in the directory TEST1 only.

This file will be too large to put on the floppy. The PKware
software that can be found in the DOS subdirectory under the
TEST1 directory on the TEST DATA 1 CD-ROM will be used to
compress the file into a size that should fit onto the
floppy. The command to compress the file is

    pkzip -ex <SITE_NAME>.zip <SITE_NAME>.tar

This will create a file <SITE_NAME>.zip which will contain the
compressed version of <SITE_NAME>.tar.

Copy the file <SITE_NAME>.zip to the floppy and send it by
Express  Mail  or  by  Federal  Express  to: 

R.  Allen  Wilkinson 
Room  A216  TECH 
NIST/NCSL/ASD/IRG 
Gaithersburg,  MD  20899 

E System Summaries For Results Submitted On Time


Jon  Geist,  Jonathan  J.  Hull,  Stanley  Janet,  R.  Allen  Wilkinson,  and  Charles  L.  Wilson 

This  appendix  contains  summaries  for  all  system  results  that  were  received  on  time.  The 
first  page  of  each  summary  lists  pertinent  information  about  the  system  such  as  the  type 
of  preprocessing,  the  type  of  feature  extraction,  the  type  of  classification,  and  the  training 
data  used,  whenever  such  information  was  provided  by  the  participants.  This  page  also 
summarizes  the  error  rate  as  a function  of  rejection  rate  and  the  OCR  rate  in  characters  per 
second  (CPS)  for  the  digit,  upper  case,  and  lower  case  tests. 

The  second  page  of  each  system  summary  gives  references  to  pertinent  publications  for  the 
system  and  optional  comments  by  the  participants  where  such  were  provided.  The  DARPA 
Systems  Conferences  upon  which  this  Conference  was  modeled  provide  a page  for  comments, 
so  such  a page  was  provided  here.  Very  few  participants  in  the  Conference  took  advantage 
of  this  page,  and  some  of  those  that  did  used  it  more  for  advertising  than  for  information 
exchange.  Bear  in  mind  that  the  information  given  under  the  heading  COMMENTS  was 
provided  by  the  participants,  and  does  not  necessarily  represent  the  opinions  of  the  Bureau 
of  the  Census,  NIST,  or  the  Committee. 

The  first  graph  on  the  third  page  of  each  system  summary  plots  the  logarithm  of  the  system 
error  rate  versus  the  rejection  rate  for  each  test  (digits  = diamonds,  upper  case  letters  = 
plus  signs,  and  lower  case  letters  = squares)  for  which  results  were  submitted. 

The second graph on the third page of each summary is a little more difficult to explain.
The abscissa of this graph is the zero-rejection-rate error rate for all of the test characters
produced by a single writer for a given test (digits, upper case letters, lower case letters).
The ordinate of this graph is the number of writers for which the single-writer
zero-rejection-rate error rate is less than the percentage given on the abscissa. Again there is one curve for
each test for which results were submitted. The three curves for digits, upper case letters,
and lower case letters are not labeled, but they are readily distinguished. The curves for the
upper and lower case letters are characterized by large steps near 4% and 8% error rate,
corresponding to one and two incorrect characters out of a maximum of 26 letters per writer.
The rounding of these steps is caused by the fact that not all writers are represented by all 26
upper or lower case letters. Some letters were lost as a result of segmentation errors. The
lower (upper) of the two curves with the large steps is always the curve for the upper (lower)
case letters. The curve for digits has much smaller steps because there are many more digits
(a maximum of 130) per writer than letters per writer.
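
The construction of this per-writer curve can be sketched in Python as follows. The record layout (writer identifier, hypothesized class, correct class) and the function names are assumptions made for illustration; the report does not specify how the scoring software stored this information.

```python
from collections import defaultdict


def per_writer_error_rates(records):
    """Map each writer to its zero-rejection-rate error rate.

    records: iterable of (writer_id, hypothesis, truth) tuples (assumed layout).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for writer, hypothesis, truth in records:
        totals[writer] += 1
        if hypothesis != truth:
            errors[writer] += 1
    return {writer: errors[writer] / totals[writer] for writer in totals}


def writers_below(rates, percent):
    """Ordinate of the curve: writers whose error rate is below `percent`."""
    return sum(1 for rate in rates.values() if 100.0 * rate < percent)


# Three hypothetical writers with 26 letters each and 0, 1, and 2 errors.
letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
records = []
for writer, wrong in (("w0", 0), ("w1", 1), ("w2", 2)):
    for index, letter in enumerate(letters):
        hypothesis = "?" if index < wrong else letter
        records.append((writer, hypothesis, letter))

rates = per_writer_error_rates(records)
print([writers_below(rates, p) for p in (3.0, 4.0, 8.0)])  # [1, 2, 3]
```

With 26 letters per writer, one error gives 1/26 (about 3.8%) and two errors give 2/26 (about 7.7%), which is where the large steps in the letter curves come from.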

The  fourth  through  sixth  pages  of  each  summary  contain  three  pseudo-correlation  graphs. 
These  show  the  correlations  between  the  classifications  produced  by  the  system  in  question 
and  the  classifications  produced  by  all  of  the  other  systems.  The  plus  and  minus  signs  in 
the  graphs  report  two  different  correlation  measures,  whereas  the  continuous  lines  are  for 
reference  purposes.  These  graphs  are  also  somewhat  difficult  to  explain. 

System number 1 in each graph is the system that is the subject of the particular system
summary being read. Each plus sign reports the ratio of the number of zero-rejection-rate
classifications that were identical for system number 1 and for the system corresponding to
the number on the abscissa to the total number of characters to be classified. Each minus
sign reports the ratio of the number of zero-rejection-rate classifications that were correct
for both system number 1 and the system corresponding to the number on the abscissa to the
total number of characters to be classified. The systems are ordered and numbered along the
abscissa according to their plus-sign pseudo-correlation with system number 1. This means
that the ordering could be different for the digit, upper case letter, and lower case letter
tests within every system summary and between system summaries. Therefore, a key to the
numbers on the abscissa and the correlation data is provided for each graph on the same page
of the summary. The key also contains the numerical values for the pseudo-correlations.

The  upper  continuous  line  in  the  pseudo-correlation  graphs  is  the  zero-rejection-rate  accuracy 
rate  (one  minus  the  error  rate)  for  each  of  the  systems  listed  along  the  abscissa.  The  lower 
continuous  line  is  the  upper  continuous  line  displaced  downward  by  the  zero-rejection-rate 
error  rate  for  system  number  1,  the  system  in  question.  The  lower  and  upper  lines  are  lower 
and  upper  bounds,  respectively,  for  the  minus  signs.  The  minus  signs  are  lower  bounds  for 
the  plus  signs. 
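
A minimal Python sketch of the two pseudo-correlation measures and of the bounds just stated is given below. The function names and the toy data are hypothetical; the definitions follow the text: the plus-sign value counts characters on which the two systems agree, and the minus-sign value counts characters on which both are correct.

```python
def accuracy(truth, hypothesis):
    """Zero-rejection-rate accuracy (one minus the error rate)."""
    return sum(h == t for t, h in zip(truth, hypothesis)) / len(truth)


def pseudo_correlations(truth, system_1, system_k):
    """Return (plus, minus) for system number 1 versus another system."""
    n = len(truth)
    plus = sum(a == b for a, b in zip(system_1, system_k)) / n
    minus = sum(a == t and b == t
                for t, a, b in zip(truth, system_1, system_k)) / n
    return plus, minus


truth    = list("0123456789")
system_1 = list("0123456780")  # one error, on the last character
system_k = list("0123456180")  # two errors, one shared with system 1

plus, minus = pseudo_correlations(truth, system_1, system_k)
upper = accuracy(truth, system_k)                   # upper continuous line
lower = upper - (1.0 - accuracy(truth, system_1))   # lower continuous line

print(plus, minus)  # 0.9 0.8
assert lower <= minus <= upper  # the lines bound the minus signs
assert minus <= plus            # minus signs are lower bounds for plus signs
```

The shared error on the last character is counted by the plus sign but not by the minus sign, which is why two systems that make the same mistakes show a large gap between the two measures.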

The pseudo-correlation graphs are useful for determining which systems might be used together
to produce a lower error rate than either system alone. For example, there is little
use in combining the HUGHES_1 and HUGHES_2 systems, which produced virtually
identical zero-rejection-rate error rates, because they are so strongly correlated. On the
other hand, the U_PENN and NIST_2 systems also produce virtually identical results, but
are much less strongly correlated. Therefore, combining their results might give a better
system, at least as a function of rejection rate, if not for the zero-rejection-rate error rate.

The List of Figures and the List of Tables at the beginning of the Report following the Table
of Contents can be used as an index of the system summaries given in this Appendix.

SYSTEM:          AEG

PARTICIPANT:     Juergen Franke
ORGANIZATION:    AEG Electrocom GmbH, Konstanz, Germany
PREPROCESSING:   normalization for size, stroke width, and slant
FEATURES:        KL transform into 256 features
CLASSIFICATION:  adaptive statistical polynomial classification
                 (POLYFONT)
HARDWARE:        VAX 6510 without 6510 vector processor

TRAINING:        DIGITS    UPPERS    LOWERS    DATABASE
                 all       all       all       NSDB3

STATUS:          on time

RESULTS:         -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
                 REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
                 RATE    RATE      RATE    RATE      RATE    RATE
                 0.00    0.0343    0.00    0.0374    0.00    0.1274
                 0.10    0.0067    0.10    0.0107    0.10    0.0876
                 0.20    0.0029    0.20    0.0053    0.20    0.0562
                 0.30    0.0029    0.30    0.0047    0.30    0.0358
                 0.40    0.0031    0.40    0.0042    0.40    0.0249
                 0.50    0.0032    0.50    0.0042    0.50    0.0237

OCR RATE (CPS):  DIGITS    UPPERS    LOWERS
SYS RATE:        not much lower than CPU rate
CPU RATE:        33.70     4.42      3.77
                 (about 10 times faster with 6510 vector processor)

BIBLIOGRAPHY:

The  following  references  have  been  provided  for  this  system: 
none 

COMMENTS:  AEG 
COMPANY  CAPABILITIES: 

AEG  Electrocom  GmbH  is  a Constance  based  subsidiary  of  the  AEG  Group.  AEG  represents  one  of 
the  four  main  branches  of  the  Daimler  Benz  Group.  At  AEG  Electrocom  currently  approximately 
1400  employees  are  responsible  for  an  annual  turnover  of  approximately  250  million  DM. 

AEG Electrocom's mission is to qualify as an efficient partner for high tech systems in automation,
information technology and communications with precision mechanics, advanced electronics and
customer specific software. AEG Electrocom is sharing R&D efforts for character recognition with
the Daimler Benz Research Institute at Ulm, Germany.

The product range includes:
- Letter sorting systems
- Parcel and flat sorters
- Recognition systems: various form readers, reading electronics, scanners for OCR/ICR applications

Today, AEG is successfully addressing the US market with solutions for address and form reading
(including hand print). AEG is represented in the US market by our subsidiary:
AEG Washington
1350 Connecticut Avenue NW
Washington, DC 20036
Phone: (202) 835-2003
FAX: (202) 835-2022

STATE  OF  THE  ART  IN  OCR  OF  HANDPRINTED  CHARACTERS 

AEG Electrocom has sold world-wide many thousands of systems for postal address reading and
forms reading applications.

AEG’S  CHARACTER  RECOGNITION  TECHNOLOGY 

AEG's ICR technology, called POLYFONT, is based on a mathematical statistical approach and
applies a polynomial classifier for the recognition task. The basis for the recognition process is a
bit-map of the characters to be recognized. On this bit-map, a primary segment calculation (black
connected components) is applied. Primary segments are clustered together into compound objects
which reflect single characters. These compound objects are normalized into a matrix. This matrix
is represented afterwards by a vector with 256 dimensions. The vector is fed into the classifier.
The classifier will produce a confidence level indicating the probability that an input pattern
belongs to a shape class which is stored in the classifier. The shape classes, which resemble the
"typical" representation of a character to be recognized, are determined during the training of the
classifier.

The core recognition algorithms are similar for the complete AEG product range. Performance
of the different products varies in throughput and in the methods for image handling and
preprocessing, as well as for the determination of the actual meaning of a shape class assigned by the
recognizer. These processes are combined from a large toolbox to match customer requirements.

Figure 11: Error rate versus rejection rate for AEG (error rate (%) against rejection rate (%); curves for digits, uppers, and lowers)

Figure 12: Error rate per writer of AEG

Figure 13: AEG - digit correlation (abscissa: system number)


System   System Name   Correlation   Correlation
Number                 (all)         (correct)
   1     AEG           1.0000        1.0000
   2     VOTE_M        0.9781        0.9606
   3     ERIM_1        0.9674        0.9492
   4     VOTE_P        0.9659        0.9534
   5     REFERENCE     0.9657        0.9657
   6     OCRSYS        0.9652        0.9586
   7     ELSAGB_3      0.9645        0.9502
   8     ELSAGB_2      0.9642        0.9499
   9     UBOL          0.9632        0.9446
  10     ERIM_2        0.9616        0.9465
  11     ATT^          0.9606        0.9469
  12     KODAK_2       0.9605        0.9448
  13     ATT_4         0.9605        0.9446
  14     ATTJ          0.9598        0.9487
  15     NIST_4        0.9598        0.9397
  16     IBM           0.9578        0.9464
  17     ELSAGB_1      0.9555        0.9378
  18     SYMBUS        0.9536        0.9388
  19     KODAK_1       0.9536        0.9383
  20     ATTJ          0.9533        0.9384
  21     HUGHES_1      0.9526        0.9378
  22     THINK J       0.9523        0.9420
  23     HUGHES_2      0.9520        0.9375
  24     NESTOR        0.9516        0.9389
  25     THINK_1       0.9496        0.9356
  26     REI           0.9466        0.9389
  27     NYNEX         0.9454        0.9363
  28     GTESS_1       0.9352        0.9203
  29     GTESS_2       0.9346        0.9193
  30     COMCOM        0.9314        0.9286
  31     NIST_1        0.9247        0.9095
  32     GMD_3         0.9239        0.9075
  33     MIME          0.9155        0.9011
  34     GMD_1         0.9152        0.9005
  35     ASOL          0.9148        0.8996
  36     UPENN         0.9124        0.8974
  37     NISTJ         0.9124        0.8968
  38     NISTJ         0.9096        0.8928
  39     GMD_1         0.8998        0.8858
  40     RISO          0.8990        0.8828
  41     KAMAN_1       0.8937        0.8762
  42     KAMAN_3       0.8789        0.8608
  43     KAMAN_2       0.8768        0.8587
  44     KAMAN_5       0.8540        0.8391
  45     GMD_2         0.8525        0.8369
  46     VALEN_2       0.8427        0.8315
  47     IFAX          0.8336        0.8194
  48     VALEN_1       0.8229        0.8093
  49     KAMAN_4       0.8043        0.7867

Table 11: AEG correlation graph key for digits.

Figure 14: AEG - upper case correlation (abscissa: system number)


System   System Name   Correlation   Correlation
Number                 (all)         (correct)
   1     AEG           1.0000        1.0000
   2     VOTE_M        0.9680        0.9518
   3     REFERENCE     0.9626        0.9626
   4     ERIM_1        0.9477        0.9323
   5     ATT_4         0.9475        0.9330
   6     UMICH_1       0.9450        0.9313
   7     NYNEX         0.9440        0.9312
   8     UBOL          0.9433        0.9246
   9     ATT_2         0.9384        0.9252
  10     VOTE_P        0.9374        0.9270
  11     NESTOR        0.9374        0.9238
  12     HUGHES_1      0.9348        0.9198
  13     HUGHES_2      0.9324        0.9173
  14     ATT_3         0.9322        0.9170
  15     ATT_1         0.9305        0.9172
  16     KODAK_1       0.9302        0.9158
  17     IBM           0.9295        0.9173
  18     SYMBUS        0.9268        0.9126
  19     OCRSYS        0.9205        0.9091
  20     GTESS_1       0.9189        0.9042
  21     GTESS_2       0.9172        0.9024
  22     MIME          0.8977        0.8841
  23     NIST_4        0.8972        0.8820
  24     ASOL          0.8885        0.8740
  25     REI           0.8771        0.8658
  26     GMD_1         0.8601        0.8467
  27     RISO          0.8587        0.8446
  28     NIST_1        0.8585        0.8457
  29     GMD_3         0.8573        0.8444
  30     KAMAN_1       0.8478        0.8354
  31     GMD_4         0.8409        0.8282
  32     COMCOM        0.8222        0.8168
  33     NIST-J        0.8196        0.8125
  34     KAMAN_3       0.8007        0.7892
  35     IFAX          0.8007        0.7889
  36     KAMAN_2       0.7922        0.7802
  37     NISTJ         0.7667        0.7561
  38     VALEN_1       0.7572        0.7450
  39     GMD_2         0.7542        0.7424
  40     KAMAN_4       0.7290        0.7169
  41     KAMAN_5       0.6585        0.6491
  42     UMICH_2       0.0337        0.0228

Table 12: AEG correlation graph key for uppers.


Figure 15: AEG - lower case correlation (abscissa: system number)


System   System Name   Correlation   Correlation
Number                 (all)         (correct)
   1     AEG           1.0000        1.0000
   2     VOTE_M        0.8987        0.8468
   3     REFERENCE     0.8726        0.8726
   4     UBOL          0.8663        0.8026
   5     ERIM_1        0.8644        0.8112
   6     UMICH_1       0.8521        0.7987
   7     OCRSYS        0.8508        0.8046
   8     KODAK_1       0.8508        0.8006
   9     HUGHES_1      0.8440        0.7933
  10     ATT J         0.8438        0.7989
  11     HUGHES_2      0.8427        0.7916
  12     ATTJ          0.8422        0.7886
  13     ATT_2         0.8396        0.7975
  14     NYNEX         0.8373        0.7979
  15     ATT_4         0.8362        0.7946
  16     IBM           0.8339        0.7897
  17     NESTOR        0.8331        0.7900
  18     VOTE_P        0.8247        0.7950
  19     GTESS_1       0.8148        0.7702
  20     NIST_4        0.8144        0.7565
  21     GTESS_2       0.8067        0.7617
  22     NIST_1        0.7924        0.7542
  23     GMD_3         0.7826        0.7391
  24     RISO          0.7817        0.7360
  25     ASOL          0.7763        0.7347
  26     GMD_4         0.7668        0.7235
  27     GMD_1         0.7668        0.7235
  28     NIST_3        0.7394        0.7235
  29     GMD_2         0.7000        0.6659
  30     KAMAN_1       0.6899        0.6487
  31     VALEN_1       0.6885        0.6446
  32     NIST_2        0.6702        0.6384
  33     KAMAN_3       0.6677        0.6266
  34     KAMAN_2       0.6531        0.6137
  35     KAMAN_5       0.5755        0.5407
  36     KAMAN_4       0.5438        0.5117
  37     COMCOM        0.5011        0.4882
  38     UMICH_2       0.0898        0.0505

Table 13: AEG correlation graph key for lowers.

SYSTEM:          ASOL

PARTICIPANT:     Thomas Baker
ORGANIZATION:    Adaptive Solutions, Inc., Beaverton, OR
PREPROCESSING:   size normalization to 8x8
FEATURES:        Digits: raw
                 Uppers: raw and histograms from four directions
                 Lowers: raw and histograms from four directions
CLASSIFICATION:  One layer Learning Vector Quantization NN
HARDWARE:        CNAPS computer, digital SIMD processor array,
                 64 processors per chip, multiple chips per board.
                 Each processor is similar to DSP.

TRAINING:        DIGITS    UPPERS    LOWERS    DATABASE
                 65000     44951     45313     NSDB3

STATUS:          on time

RESULTS:         -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
                 REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
                 RATE    RATE      RATE    RATE      RATE    RATE
                 0.00    0.0891    0.00    0.1116    0.00    0.2125
                 0.10    0.0636    0.10    0.0842    0.10    0.1795
                 0.20    0.0377    0.20    0.0592    0.20    0.1597
                 0.30    0.0215    0.30    0.0423    0.30    0.1280
                 0.40    0.0192    0.40    0.0407    0.40    0.1062
                 0.50    0.0184    0.50    0.0457    0.50    0.0745

OCR RATE (CPS):  DIGITS    UPPERS    LOWERS
SYS RATE:        77.06     51.92     51.72
CPU RATE:        1303.24   459.27    461.54

NOTE: Output is the Euclidean distance between nodes in the network and the input vector. Net
reported top three classes.

BIBLIOGRAPHY: [3][4]

COMMENTS: 

System  Description: 

The OCR system submitted by Adaptive Solutions used a Learning Vector Quantization (LVQ)
neural network classifier. LVQ is a single layer, winner-take-all network. Each weight vector in the
network is assigned to a class. There can be more than one weight vector assigned to each class.
The network uses the lowest Euclidean distance between the weight vectors and the input vector
to determine the winning class. The digit network had 224 output nodes, and the upper and lower
case networks each had 416 output nodes.

The  digits  were  normalized  to  an  8x8  array  and  input  to  the  network.  The  inputs  to  the  upper 
and  lower  case  networks  were  a combination  of  the  8x8  normalized  data  and  a histogram  of  the 
characters  taken  from  the  top,  bottom,  left  and  right  of  a 16x16  scale  normalized  array. 

To report the confidence of the classification the three closest weight vectors were used. Statistics
were accumulated based on the ordering of the outputs. The statistics were put into a table for
reporting the confidence of the test data.
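
The winner-take-all step described above can be illustrated with a short Python sketch. It shows only the nearest-weight-vector lookup and the retrieval of the three closest weight vectors; the tiny two-dimensional codebook and the function names are hypothetical, and the training procedure and the confidence table used by the participant are not reproduced here.

```python
import math


def nearest_classes(x, codebook, k=3):
    """Return the k closest (label, distance) pairs, nearest first.

    codebook: list of (weight_vector, class_label); a class may own
    several weight vectors, as in LVQ.
    """
    ranked = sorted(codebook, key=lambda entry: math.dist(x, entry[0]))
    return [(label, math.dist(x, weights)) for weights, label in ranked[:k]]


def classify(x, codebook):
    """Winner-take-all: the class of the single closest weight vector."""
    return nearest_classes(x, codebook, k=1)[0][0]


# Hypothetical codebook: class "1" owns two weight vectors.
codebook = [
    ((0.0, 0.0), "0"),
    ((1.0, 0.0), "1"),
    ((0.0, 1.0), "7"),
    ((0.9, 0.1), "1"),
]
x = (0.8, 0.1)

print(classify(x, codebook))                                  # 1
print([label for label, _ in nearest_classes(x, codebook)])   # ['1', '1', '0']
```

In the submitted system the input vectors had at least 64 components (the 8x8 normalized image), so the same lookup would run over much longer vectors and several hundred weight vectors.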

The  neural  network  classifier  was  trained  and  tested  on  an  Adaptive  Solutions  neurocomputer  using 
a CNAPS  parallel  array  of  processors.  The  system  that  was  used  for  the  conference  results  had  32 
processors.  A system  that  used  64  processors  for  the  preprocessing  and  classification  of  the  test 
digits  achieved  a speed  of  over  1400  characters  per  second. 

For  questions  or  comments  please  contact: 

Thomas Baker
Adaptive Solutions, Inc.
1400 N.W. Compton Drive, Suite 340
Beaverton, Oregon 97006

INTERNET: tom@asi.com
UUCP: uunet!adaptive!tom
PHONE: (503) 690-1236
FAX: (503) 690-1249

Figure 16: Error rate versus rejection rate for ASOL (error rate (%) against rejection rate (%); curves for digits, uppers, and lowers)

Figure 17: Error rate per writer of ASOL

Figure 18: ASOL - digit correlation (abscissa: system number)


System   System Name   Correlation   Correlation
Number                 (all)         (correct)
   1     ASOL          1.0000        1.0000
   2     VOTE_M        0.9225        0.9071
   3     ATT_4         0.9153        0.8974
   4     AEG           0.9148        0.8996
   5     KODAK_2       0.9148        0.8973
   6     VOTE_P        0.9147        0.9037
   7     ATT J         0.9143        0.8989
   8     ERIM_1        0.9134        0.8972
   9     THINK_1       0.9115        0.8922
  10     ELSAGB_3      0.9114        0.8985
  11     ELSAGB_2      0.9112        0.8983
  12     ATT J         0.9111        0.8990
  13     REFERENCE     0.9109        0.9109
  14     OCRSYS        0.9109        0.9050
  15     ERIM_2        0.9100        0.8954
  16     NIST_4        0.9097        0.8907
  17     ATT J         0.9093        0.8924
  18     KODAK_1       0.9093        0.8921
  19     IBM           0.9085        0.8965
  20     SYMBUS        0.9075        0.8915
  21     NESTOR        0.9067        0.8921
  22     UBOL          0.9067        0.8918
  23     ELSAGB_1      0.9057        0.8890
  24     THINKS        0.9018        0.8917
  25     HUGHES_1      0.9018        0.8878
  26     HUGHES_2      0.9011        0.8873
  27     GTESS_1       0.9006        0.8799
  28     GTESS_2       0.9001        0.8790
  29     NYNEX         0.9000        0.8889
  30     REI           0.8972        0.8893
  31     NIST_1        0.8942        0.8717
  32     NIST_2        0.8929        0.8655
  33     MIME          0.8913        0.8664
  34     GMD_3         0.8893        0.8677
  35     NIST_3        0.8881        0.8610
  36     RISO          0.8867        0.8551
  37     GMD_1         0.8821        0.8615
  38     COMCOM        0.8802        0.8772
  39     UPENN         0.8773        0.8577
  40     KAMAN_1       0.8701        0.8446
  41     GMD_4         0.8683        0.8479
  42     KAMAN_3       0.8541        0.8291
  43     KAMAN_2       0.8527        0.8271
  44     GMD_2         0.8392        0.8116
  45     KAMAN_5       0.8302        0.8080
  46     VALEN_2       0.8092        0.7944
  47     IFAX          0.8057        0.7860
  48     VALEN_1       0.8028        0.7802
  49     KAMAN_4       0.7865        0.7592

Table 14: ASOL correlation graph key for digits.

Figure 19: ASOL - upper case correlation (abscissa: system number)


System   System Name   Correlation   Correlation
Number                 (all)         (correct)
   1     ASOL          1.0000        1.0000
   2     VOTE_M        0.9008        0.8827
   3     ATT_4         0.8967        0.8734
   4     AEG           0.8883        0.8740
   5     REFERENCE     0.8884        0.8884
   6     ATT_2         0.8824        0.8647
   7     KODAK_1       0.8819        0.8392
   8     VOTE_P        0.8809        0.8703
   9     UBOL          0.8797        0.8398
  10     ERIM_1        0.8789        0.8639
  11     NYNEX         0.8787        0.8647
  12     UMICH_1       0.8781        0.8633
  13     ATTJ          0.8769        0.8370
  14     SYMBUS        0.8761        0.8343
  15     NESTOR        0.8747        0.8399
  16     ATT J         0.8733        0.8338
  17     IBM           0.8712        0.8337
  18     HUGHES_1      0.8673        0.8318
  19     GTESS_1       0.8662        0.8462
  20     GTESS_2       0.8638        0.8433
  21     HUGHES_2      0.8636        0.8498
  22     MIME          0.8648        0.8372
  23     OCRSYS        0.8363        0.8441
  24     NIST_4        0.8308        0.8283
  25     RISO          0.8409        0.8085
  26     NIST_1        0.8309        0.8036
  27     GMD_1         0.8233        0.8004
  28     REI           0.8223        0.8092
  29     GMD_3         0.8208        0.7981
  30     KAMAN_1       0.8116        0.7903
  31     GMD_4         0.8061        0.7836
  32     NIST_3        0.7972        0.7761
  33     KAMAN_3       0.7711        0.7495
  34     COMCOM        0.7684        0.7623
  35     IFAX          0.7638        0.7449
  36     KAMAN_2       0.7630        0.7421
  37     NIST_2        0.7340        0.7270
  38     GMD_2         0.7438        0.7148
  39     VALEN_1       0.7249        0.7043
  40     KAMAN_4       0.7107        0.6853
  41     KAMAN_5       0.6360        0.6189
  42     UMICH_2       0.0310        0.0188

Table 15: ASOL correlation graph key for uppers.

Figure 20: ASOL - lower case correlation (abscissa: system number)


System   System       Correlation   Correlation
Number   Name         (all)         (correct)

1        ASOL         1.0000        1.0000
2        VOTE_M       0.8108        0.7660
3        ATT_4        0.7969        0.7408
4        REFERENCE    0.7875        0.7875
5        ATT_2        0.7801        0.7355
6        KODAK_1      0.7788        0.7320
7        NYNEX        0.7773        0.7328
8        AEG          0.7763        0.7347
9        ERIM_1       0.7757        0.7345
10       ATT_1        0.7690        0.7290
11       UBOL         0.7634        0.7189
12       IBM          0.7633        0.7203
13       ATT_3        0.7628        0.7193
14       NESTOR       0.7617        0.7222
15       UMICH_1      0.7603        0.7198
16       OCRSYS       0.7600        0.7242
17       VOTE_P       0.7593        0.7349
18       NIST_1       0.7581        0.7065
19       GTESS_1      0.7561        0.7084
20       GTESS_2      0.7547        0.7058
21       HUGHES_1     0.7512        0.7128
22       HUGHES_2     0.7512        0.7119
23       RISO         0.7493        0.6898
24       NIST_4       0.7323        0.6857
25       GMD_3        0.7283        0.6841
26       NIST_3       0.7187        0.6867
27       GMD_4        0.7146        0.6706
28       GMD_1        0.7146        0.6706
29       GMD_2        0.6852        0.6344
30       NIST_2       0.6581        0.6076
31       KAMAN_1      0.6506        0.6039
32       VALEN_1      0.6450        0.5987
33       KAMAN_3      0.6314        0.5863
34       KAMAN_2      0.6149        0.5710
35       KAMAN_5      0.5412        0.5042
36       KAMAN_4      0.5184        0.4772
37       COMCOM       0.4651        0.4545
38       UMICH_2      0.1101        0.0513

Table  16:  ASOL  correlation  graph  key  for  lowers. 
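The portion of the report shown here does not define the two correlation columns. A minimal sketch, assuming that "Correlation (all)" is the fraction of test characters on which two systems give the same answer and "Correlation (correct)" is the fraction on which both give the same, correct answer, is given below. The function name and the toy label lists are hypothetical. Under this assumption the REFERENCE row necessarily has equal entries in both columns, which agrees with the tables.

```python
def correlations(a, b, truth):
    """Return (all, correct) agreement fractions between two systems.

    a, b  : classifications produced by the two systems (same order)
    truth : reference classifications for the same characters
    Assumed definitions, not taken from the report:
      all     = fraction of characters where a and b agree
      correct = fraction where a and b agree and are also right
    """
    n = len(truth)
    if n == 0 or len(a) != n or len(b) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum(1 for x, y in zip(a, b) if x == y)
    agree_correct = sum(1 for x, y, t in zip(a, b, truth) if x == y == t)
    return agree / n, agree_correct / n


# Toy data (hypothetical): ten characters, two imperfect systems.
truth = list("abcdefghij")
sys_a = list("abcdefghxx")
sys_b = list("abcdefgyxz")

all_corr, correct_corr = correlations(sys_a, sys_b, truth)

# Comparing a system with the reference itself: every agreement is
# also correct, so both columns coincide.
ref_all, ref_correct = correlations(sys_a, truth, truth)
```

With these definitions the REFERENCE entry for a system is simply its recognition rate at zero rejection.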
SYSTEM: ATT_1

PARTICIPANT: Dr. Craig R. Nohl
ORGANIZATION: AT&T Bell Laboratories, Holmdel, NJ
FEATURES: gray levels in rescaled image
CLASSIFICATION: k-NN with specially-designed distance measure that
can compensate for some common distortions such as
translation.

HARDWARE: SPARC2

TRAINING:  DIGITS   UPPERS   LOWERS   DATABASE
           220000   44000    44000    NSDB3

STATUS: on time

RESULTS:   -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
           REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
           RATE    RATE      RATE    RATE      RATE    RATE
           0.00    0.0316    0.00    0.0655    0.00    0.1378
           0.10    0.0069    0.10    0.0331    0.10    0.1013
           0.20    0.0025    0.20    0.0171    0.20    0.0706
           0.30    0.0012    0.30    0.0094    0.30    0.0473
           0.40    0.0011    0.40    0.0050    0.40    0.0310
           0.50    0.0011    0.50    0.0028    0.50    0.0182

OCR RATE (CPS):  DIGITS   UPPERS   LOWERS
SYS RATE:        0.30     0.32     0.20

CPU  RATE: 
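The RESULTS block pairs each rejection rate with the error rate that remains. How such a pair can be computed is sketched below, assuming that the system attaches a confidence to every classification, that the least confident fraction is rejected, and that the error rate is taken over the characters that were kept. These assumptions, the function name, and the toy data are illustrative only; the scoring rules actually used are not given in this section.

```python
def error_at_rejection(confidences, correct, reject_fraction):
    """Error rate among accepted characters after rejecting the
    least-confident `reject_fraction` of all characters.

    confidences : one confidence value per character
    correct     : one bool per character (True if classified correctly)
    """
    n = len(confidences)
    if n == 0 or len(correct) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    # Indices sorted from least to most confident.
    order = sorted(range(n), key=lambda i: confidences[i])
    n_reject = int(round(reject_fraction * n))
    kept = order[n_reject:]
    if not kept:
        return 0.0
    errors = sum(1 for i in kept if not correct[i])
    return errors / len(kept)


# Toy data (hypothetical): ten characters, three of them misclassified.
conf = [0.9, 0.2, 0.8, 0.1, 0.7, 0.6, 0.95, 0.3, 0.85, 0.5]
ok = [True, False, True, False, True, True, True, True, True, False]

curve = [(r, error_at_rejection(conf, ok, r)) for r in (0.0, 0.2, 0.5)]
```

The error rate falls as the rejection rate rises whenever low confidence tends to coincide with misclassification, which is the pattern visible in the RESULTS columns.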
SYSTEM: ATT_1
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[5][6][7][8] 
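The ATT_1 data sheet describes a k-NN classifier whose distance measure compensates for distortions such as translation, without further detail. The following is only a sketch of that general idea, not the participant's method: the distance between two images is taken as the smallest squared pixel difference over a few small shifts. All function names, the shift range, and the 5x5 toy images are assumptions made for illustration.

```python
from collections import Counter


def shifted(img, dx, dy):
    """Shift a 2-D list image by (dx, dy); exposed pixels become 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def sq_dist(a, b):
    """Plain squared pixel distance between two equal-size images."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def shift_tolerant_dist(a, b, max_shift=1):
    """Smallest squared distance over all shifts of `a` up to max_shift."""
    shifts = range(-max_shift, max_shift + 1)
    return min(sq_dist(shifted(a, dx, dy), b) for dx in shifts for dy in shifts)


def knn_classify(query, train, k=1):
    """Label of the majority among the k nearest training images."""
    ranked = sorted(train, key=lambda item: shift_tolerant_dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def vbar(col):
    """5x5 image with a vertical bar in the given column."""
    return [[1 if x == col else 0 for x in range(5)] for _ in range(5)]


def hbar(row):
    """5x5 image with a horizontal bar in the given row."""
    return [[1 if y == row else 0 for _ in range(5)] for y in range(5)]


train = [(vbar(2), "1"), (hbar(2), "-")]
query = vbar(3)  # the same stroke as the "1" prototype, one pixel to the right
label = knn_classify(query, train, k=1)
```

In this toy case the plain pixel distance places the shifted bar closer to the horizontal prototype, while the shift-tolerant distance matches it to the vertical one.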
Figure 21: Error rate versus rejection rate for ATT_1 (curves: DIGITS, UPPERS, LOWERS; x-axis: REJECTION RATE (%))

Figure 22: Error rate per writer of ATT_1 (axis label: NUMBER WRITERS WITH ERROR E)
Figure 23: ATT_1 - digit correlation (x-axis: SYSTEM NUMBER)


System  N u m ber 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

ATTJ 

1 0000 

1.0000 

2 

VOTE_M 

0 9693 

0-9580 

3 

REFERENCE 

0 9684 

0.9684 

4 

OCRSYS 

0 9653 

0 9601 

5 

AEG 

0 9698 

0-9487 

6 

ELS AGB J 

0 9594 

0 9491 

7 

ELSAGB^ 

0 9591 

0.9488 

8 

VOTE_P 

0.9680 

0.9496 

9 

ATT.4 

0 9673 

0 9445 

10 

ATTJ 

0 9666 

0 9462 

11 

ERIM.l 

0.9653 

0.9446 

12 

KODAK..2 

0.9644 

0 9431 

13 

IBM 

0 9636 

0 9466 

14 

ERIMJ 

0.9634 

0.9433 

15 

UBOL 

0 9611 

0,9403 

16 

THINKS 

0 9506 

0 9422 

17 

THINK.l 

0-9600 

0 9369 

l« 

NIST.4 

0 9486 

0.9357 

19 

KODAKJ 

0 9484 

0.9371 

20 

REl 

0 9467 

0.9400 

21 

NYNEX 

0 9462 

0 9380 

22 

SYMBUS 

0 9462 

0.9361 

23 

E L>S  A G 6 _1 

0.9462 

0 9345 

24 

NESTOR 

0 9469 

0.9372 

25 

ATTJ 

0.9469 

0 9369 

26 

HUGHES. 1 

0 9435 

0 9342 

27 

HUGHES.2 

0 9434 

0 9342 

28 

GTESS.l 

0 9327 

0 9209 

29 

COMCOM 

0.9323 

0.9298 

30 

GTESS.2 

0 9322 

0 9200 

31 

NIST-1 

0 9235 

0 9105 

32 

GMD.3 

0 9166 

0-9044 

33 

MIME 

0 9139 

0 9017 

34 

ASOL 

0.9111 

0 8990 

35 

GMD.l 

0 9091 

0.8985 

36 

NISTJ 

0 9080 

0.8961 

37 

UPENN 

0 9079 

0 8961 

38 

NISTJ 

0 9032 

0 8912 

39 

RISO 

0.8969 

0 8827 

40 

GMD.4 

0 8946 

0.8843 

41 

KAMAN.l 

0 8846 

0 8730 

42 

KAMAN.3 

0 8679 

0.8569 

43 

KAMAN.2 

0 8661 

0 8543 

44 

KAMAN.4 

0 8463 

0 8363 

45 

GMD-2 

0 8467 

0 8360 

46 

VALEN.2 

0 8382 

0 8298 

47 

IFAX 

0 8275 

0 8172 

48 

VALEN.l 

0.8161 

0 8071 

49 

KAMAN.4 

0.7922 

0-7821 

Table  17:  ATT_1  correlation  graph  key  for  digits. 
Figure 24: ATT_1 - upper case correlation (x-axis: SYSTEM NUMBER)


System   System       Correlation   Correlation
Number   Name         (all)         (correct)

1        ATT_1        1.0000        1.0000
2        VOTE_M       0.9397        0.9245
3        REFERENCE    0.9345        0.9345
4        AEG          0.9305        0.9172
5        ATT_4        0.9258        0.9094
6        ATT_2        0.9228        0.9051
7        ERIM_1       0.9215        0.9066
8        NYNEX        0.9209        0.9069
9        UBOL         0.9186        0.8998
10       KODAK_1      0.9143        0.8957
11       UMICH_1      0.9140        0.9022
12       VOTE_P       0.9139        0.9049
13       NESTOR       0.9111        0.8977
14       ATT_3        0.9085        0.8929
15       IBM          0.9059        0.8934
16       HUGHES_1     0.9053        0.8921
17       SYMBUS       0.9051        0.8887
18       GTESS_1      0.9034        0.8855
19       HUGHES_2     0.9032        0.8897
20       GTESS_2      0.9017        0.8841
21       OCRSYS       0.8938        0.8832
22       MIME         0.8839        0.8653
23       NIST_4       0.8818        0.8617
24       ASOL         0.8735        0.8558
25       REI          0.8571        0.8442
26       NIST_1       0.8550        0.8338
27       GMD_1        0.8485        0.8295
28       RISO         0.8484        0.8296
29       GMD_3        0.8474        0.8277
30       KAMAN_1      0.8304        0.8160
31       GMD_4        0.8304        0.8115
32       NIST_3       0.8076        0.7972
33       COMCOM       0.8031        0.7968
34       KAMAN_3      0.7872        0.7723
35       IFAX         0.7870        0.7723
36       KAMAN_2      0.7785        0.7628
37       NIST_2       0.7586        0.7441
38       GMD_2        0.7450        0.7291
39       VALEN_1      0.7404        0.7262
40       KAMAN_4      0.7178        0.7012
41       KAMAN_5      0.6478        0.6356
42       UMICH_2      0.0453        0.0234

Table  18:  ATT_1  correlation  graph  key  for  uppers. 
Figure 25: ATT_1 - lower case correlation (x-axis: SYSTEM NUMBER)


System   System       Correlation   Correlation
Number   Name         (all)         (correct)

1        ATT_1        1.0000        1.0000
2        VOTE_M       0.8773        0.8315
3        REFERENCE    0.8622        0.8622
4        AEG          0.8438        0.7989
5        ERIM_1       0.8375        0.7938
6        ATT_2        0.8370        0.7927
7        OCRSYS       0.8325        0.7908
8        NYNEX        0.8318        0.7912
9        UBOL         0.8288        0.7796
10       ATT_4        0.8275        0.7864
11       KODAK_1      0.8261        0.7853
12       IBM          0.8224        0.7793
13       ATT_3        0.8202        0.7749
14       UMICH_1      0.8168        0.7767
15       NESTOR       0.8142        0.7780
16       HUGHES_1     0.8129        0.7738
17       GTESS_1      0.8114        0.7647
18       HUGHES_2     0.8105        0.7717
19       VOTE_P       0.8088        0.7841
20       NIST_1       0.8035        0.7554
21       GTESS_2      0.8018        0.7563
22       NIST_4       0.7885        0.7403
23       GMD_3        0.7753        0.7320
24       RISO         0.7724        0.7281
25       ASOL         0.7690        0.7290
26       GMD_4        0.7623        0.7176
27       GMD_1        0.7623        0.7176
28       NIST_3       0.7383        0.7210
29       GMD_2        0.7010        0.6637
30       VALEN_1      0.6718        0.6346
31       NIST_2       0.6705        0.6374
32       KAMAN_1      0.6697        0.6336
33       KAMAN_3      0.6461        0.6118
34       KAMAN_2      0.6311        0.5983
35       KAMAN_5      0.5658        0.5338
36       KAMAN_4      0.5254        0.4979
37       COMCOM       0.4951        0.4848
38       UMICH_2      0.1023        0.0582

Table 19: ATT_1 correlation graph key for lowers.
SYSTEM: ATT_2

PARTICIPANT: Dr. Craig R. Nohl
ORGANIZATION: AT&T Bell Laboratories, Holmdel, NJ
FEATURES: raw?
CLASSIFICATION: five layer NN with local receptive fields and
replicated weights

HARDWARE: SPARC2

TRAINING:  DIGITS   UPPERS   LOWERS   DATABASE
           156000   31000    31000    NSDB3

STATUS: on time

RESULTS:   -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
           REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
           RATE    RATE      RATE    RATE      RATE    RATE
           0.00    0.0367    0.00    0.0563    0.00    0.1406
           0.10    0.0076    0.10    0.0180    0.10    0.0893
           0.20    0.0023    0.20    0.0081    0.20    0.0538
           0.30    0.0015    0.30    0.0039    0.30    0.0317
           0.40    0.0009    0.40    0.0027    0.40    0.0151
           0.50    0.0007    0.50    0.0020    0.50    0.0080

OCR RATE (CPS):  DIGITS   UPPERS   LOWERS
SYS RATE:        5.10     1.95     1.99
CPU RATE:
SYSTEM: ATT_2

BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 

[5][6][7][8] 
Figure 26: Error rate versus rejection rate for ATT_2 (curves: DIGITS, UPPERS, LOWERS; x-axis: REJECTION RATE (%))

Figure 27: Error rate per writer of ATT_2 (axis label: NUMBER WRITERS WITH ERROR)
Figure 28: ATT_2 - digit correlation (x-axis: SYSTEM NUMBER)


S yiiem  N u m ber 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

ATT  J 

1.0000 

1 ,0000 

2 

VOTEJVl 

0,9709 

0.9566 

3 

REFERENCE 

0 9633 

0.9633 

4 

OCRSYS 

0 9627 

0,9663 

5 

VOTEJ> 

0 9606 

0.9498 

6 

AEG 

0 9606 

0.9469 

7 

ATT-4 

0 9595 

0.9436 

a 

KODAKS 

0.9585 

0.9431 

9 

IBM 

0 9583 

0 9460 

10 

ERIM.l 

0.9580 

0 9440 

1 1 

ATT-1 

0-9566 

0,9462 

12 

ELSAGB J 

0.9562 

0 9454 

13 

ELSAGB J 

0 9558 

0 9450 

14 

ERIMJ 

0.9557 

0 9428 

15 

KODAKJ 

0 9523 

0.9371 

16 

NESTOR 

0.9518 

0.9384 

17 

ATT  J 

0 9497 

0,9360 

IS 

SYMBUS 

0.9488 

0 9356 

19 

UBOL 

0.9486 

0.9368 

20 

THINKS 

0 9484 

0,9394 

21 

THINK.l 

0 9474 

0 9341 

22 

HUGHES. 1 

0.9468 

0 9341 

23 

HUGHES.2 

0.9464 

0 9340 

24 

ELS  AGB.l 

0.9463 

0 9328 

25 

NIST.4 

0 9460 

0 9327 

26 

REI 

0 9459 

0.9377 

27 

NYNEX 

0 9459 

0.9360 

28 

GTESS.l 

0 9325 

0 9186 

29 

GTESSJ 

0 9318 

0-9176 

30 

COMCOM 

0 9285 

0.9258 

31 

NIST.l 

0 9204 

0.9072 

32 

MIME 

0 9163 

0 9013 

33 

GMD.3 

0 9161 

0.9033 

34 

ASOL 

0 9143 

0 8989 

35 

NISTJ 

0 9134 

0 8974 

36 

UPENN 

0.9091 

0 8953 

37 

NISTJ 

0-9089 

0 8923 

38 

GMD.l 

0 9082 

0 8967 

39 

RISO 

0-9015 

0 8839 

40 

GMD-4 

0 8933 

0.8822 

41 

KAMAN.l 

0.8922 

0 8752 

42 

KAMAN.3 

0 8747 

0 8587 

43 

KAMAN.2 

0.8715 

0 8559 

44 

GMD.2 

0 8525 

0 8370 

45 

KAMAN.5 

0 8508 

0 8375 

46 

VALEN.2 

0 8396 

0.8288 

47 

IFAX 

0 8319 

0.8181 

48 

VALEN-1 

0 8228 

0 8091 

49 

KAMAN.4 

0-7983 

0.7834 

Table 20: ATT_2 correlation graph key for digits.
Figure 29: ATT_2 - upper case correlation (x-axis: SYSTEM NUMBER)


System  Num ber 

System  Name 

Correl&tton  (all) 

Correlation  (correct) 

1 

ATT^ 

1 0000 

1 0000 

2 

VOTEJVl 

0 9&2S 

0 9364 

3 

REFERENCE 

0 9437 

0 9437 

4 

ATT.4 

0 9422 

0 9223 

5 

AEG 

0 9384 

0 9262 

6 

NYNEX 

0 9307 

0 9166 

7 

ERIM.l 

0 9290 

0 9143 

8 

UMICH.l 

0,9266 

0 9131 

9 

VOTE  J* 

0 9264 

0 9148 

10 

ATT_1 

0 9228 

0 9061 

1 1 

KODAK J 

0 9226 

0 9039 

12 

NESTOR 

0 9224 

0,9080 

13 

UBOL 

0 9210 

0 9049 

14 

ATTJ 

0 9206 

0 9033 

IS 

IBM 

0 9198 

0 9048 

16 

SYMBUS 

0 9168 

0 8986 

17 

HUGHES. 1 

0 9131 

0 8996 

18 

HUGHES.2 

0 9106 

0 8969 

19 

GTESS-1 

0 9101 

0 8921 

20 

GTESS.2 

0 9091 

0 8912 

21 

OCRSYS 

0 9010 

0 8906 

22 

MIME 

0 8962 

0 8756 

23 

ASOL 

0 8824 

0 8647 

24 

NIST.4 

0 8812 

0 8668 

2S 

REI 

0 8668 

0 8631 

26 

RISC 

0 8619 

0 8406 

27 

NIST.l 

0 8629 

0 8364 

28 

GMD.l 

0 8488 

0 8349 

29 

GMD.3 

0 8482 

0 8333 

30 

KAMAN.l 

0 8429 

0 8261 

31 

GMD.4 

0 8303 

0 8164 

32 

NISTJ 

0 8169 

0 8049 

33 

COMCOM 

0 8084 

0 8024 

34 

KAMAN.3 

0 7973 

0 7813 

3S 

IFAX 

0 7946 

0 7799 

36 

KAMAN.2 

0 7870 

0 7712 

37 

NISTJ 

0.7668 

0 7611 

38 

GMD-2 

0 7546 

0 7376 

39 

VALEN.1 

0.7487 

0 7342 

40 

KAMAN.4 

0,7222 

0 7077 

41 

KAMAN.S 

0 6662 

0 6429 

42 

UMICHJ 

0 0408 

0 0231 

Table  21:  ATT_2  correlation  graph  key  for  uppers. 
Figure 30: ATT_2 - lower case correlation (x-axis: SYSTEM NUMBER)


Syilem  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

ATTJ 

l.OOOO 

l.OOOO 

2 

VOTE_M 

0 8879 

0.8372 

3 

REFERENCE 

0 8S94 

0.8694 

4 

ATT.4 

0 8S80 

0.8033 

S 

ERIMU 

0 840S 

0.7966 

6 

AEG 

0 8396 

0.7976 

7 

NYNEX 

0-8378 

0.7939 

8 

ATTU 

0.8370 

0.7927 

9 

KODAK_l 

0 83S7 

0.7911 

10 

IBM 

0 831S 

0.7868 

11 

OCRSYS 

0 8291 

0.7893 

12 

GTESS.l 

0 8267 

0 7734 

13 

UMICH.l 

0 8239 

0 7818 

14 

NESTOR 

0.8236 

0 7818 

IS 

ATTJ 

0 8236 

0.7780 

16 

VOTEJ» 

0-8216 

0 7928 

17 

UBOL 

0 8188 

0,7763 

18 

GTESS  J 

0 8184 

0 7662 

19 

HUGHES. 1 

0 8164 

0,7747 

20 

HUGHES.2 

0 8128 

0,7728 

21 

NIST.1 

0 7974 

0.7662 

22 

RISO 

0 7902 

0 7393 

23 

NIST.4 

0 7861 

0 7402 

24 

ASOL 

0 7801 

0.7366 

2S 

GMD.3 

0.7762 

0.7362 

26 

GMD.4 

0.7597 

0.7193 

27 

GMD.l 

0 7697 

0.7193 

28 

NISTJ) 

0.7627 

0 7306 

29 

GMDJJ 

0.7219 

0.6768 

30 

NISTJ 

0 6860 

0.6466 

31 

KAMAN.l 

0 6826 

0 6442 

32 

VALEN.1 

0 6747 

0.6376 

33 

KAMAN.3 

0 6696 

0 6227 

34 

KAMAN.2 

0 6416 

0 6076 

3S 

KAMAN.5 

0 6680 

0 6386 

36 

KAMAN.4 

0 6334 

0-6046 

37 

COMCOM 

0 6013 

0.4896 

38 

UMICHJ 

0 0944 

0.0668 

Table  22:  ATT_2  correlation  graph  key  for  lowers. 
SYSTEM: ATT_3

PARTICIPANT: Dr. Craig R. Nohl
ORGANIZATION: AT&T Bell Laboratories, Holmdel, NJ
FEATURES: raw?
CLASSIFICATION: hybrid of feature-based and NN classifiers.
The commercial NCR product.

HARDWARE: proprietary board based on Analog Devices 2901

TRAINING:  DIGITS   UPPERS   LOWERS   DATABASE
           140000   26000    23000    NSDB3

STATUS: on time

RESULTS:   -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
           REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
           RATE    RATE      RATE    RATE      RATE    RATE
           0.00    0.0484    0.00    0.0683    0.00    0.1634
           0.10    0.0129    0.10    0.0297    0.10    0.1176
           0.20    0.0126    0.20    0.0150    0.20    0.0856
           0.30    0.0127    0.30    0.0071    0.30    0.0582
           0.40    0.0128    0.40    0.0066    0.40    0.0382
           0.50    0.0124    0.50    0.0065    0.50    0.0313

OCR RATE (CPS):  DIGITS   UPPERS   LOWERS
SYS RATE:        146.38   142.32   146.82
CPU RATE:
SYSTEM: ATT_3
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[9][10] 
Figure 31: Error rate versus rejection rate for ATT_3 (curves: DIGITS, UPPERS, LOWERS; x-axis: REJECTION RATE (%))

Figure 32: Error rate per writer of ATT_3 (axis label: NUMBER WRITERS WITH ERROR)
Figure 33: ATT_3 - digit correlation (x-axis: SYSTEM NUMBER)


Syiiem  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

aTTj 

1,0000 

1-0000 

2 

VOTEJH 

0 9693 

0.9467 

AEG 

0 9633 

0,9384 

4 

REFERENCE 

0 9616 

0 9616 

5 

OCRSYS 

0 9606 

0 9447 

6 

VOTE_P 

0 9603 

0.9404 

7 

ATTJ 

0,9497 

0.9360 

8 

ERIM-1 

0.9486 

0.9346 

9 

ATT.4 

0,9473 

0 9329 

10 

ELSAGB-3 

0 9472 

0.9362 

L 1 

ELSAGB J 

0 9468 

0 9368 

12 

KODAK J 

0 9460 

0.9323 

13 

ATT-l 

0 9469 

0.9369 

14 

ERIM  J 

0 9460 

0.9326 

L5 

IBM 

0 9436 

0 9338 

16 

NIST.4 

0 9420 

0 9262 

17 

UBOL 

0 9417 

0 9289 

18 

NESTOR 

0 9404 

0.9282 

19 

KODAKJ 

0 9403 

0,9267 

20 

SYMBUS 

0 9399 

0 9268 

21 

ELSAGB.l 

0 9391 

0 9247 

22 

THINK  J 

0 9369 

0,9284 

23 

THINK.l 

0 9364 

0-9240 

24 

HUGHES. 2 

0 9349 

0 9236 

26 

HUGHES. 1 

0 9347 

0 9236 

26 

REI 

0 9337 

0 9268 

27 

NYNEX 

0 9336 

0.9261 

28 

GTESS.l 

0 9269 

0 9111 

29 

GTESS  J 

0 9262 

0 9101 

30 

COMCOM 

0 9181 

0,9166 

31 

GMD.3 

0 9133 

0,8977 

32 

NIST.l 

0 9132 

0.8996 

33 

MIME 

0 9103 

0.8942 

34 

ASOL 

0 9093 

0 8924 

36 

NISTJ 

0 9084 

0-8911 

36 

GMD.l 

0 9066 

0 8911 

37 

NISTJ 

0 9048 

0 8866 

38 

UPENN 

0 8998 

0 8867 

39 

RISC 

0 8936 

0 8763 

40 

GMD.4 

0 8907 

0 8768 

41 

KAMAN.l 

0 8869 

0 8696 

42 

KAMAN.3 

0 8696 

0.8630 

43 

KAMAN.2 

0 8680 

0 8612 

44 

GMD.2 

0.8604 

0.8326 

46 

KAMANJ. 

0 8466 

0 8314 

46 

VALEN.2 

0 8326 

0 8219 

47 

IFAX 

0 8264 

0 8123 

48 

valen.i 

0 8166 

0 8026 

49 

KAMAN.4 

0 7976 

0.7802 

Table  23:  ATT_3  correlation  graph  key  for  digits. 
Figure 34: ATT_3 - upper case correlation (x-axis: SYSTEM NUMBER)


S y«lem  N u m ber 

Sy«iem  Name 

Correlation  | all ) 

Correlation  (correct) 

1 

aTTj 

1 0000 

1 0000 

2 

VOTEJfcl 

0 9416 

0 9240 

3 

AEG 

0 9322 

0 9170 

4 

REFERENCE 

0 9317 

0 9317 

b 

ATT.4 

0 9300 

0 91 1 1 

6 

ERIM.l 

0 9223 

0 9064 

7 

ATTJ 

0.9206 

0 9033 

8 

NYNEX 

0 9186 

0 9062 

9 

VOTE-P 

0 9173 

0 9070 

10 

UBOL 

0.9169 

0 8989 

1 1 

UMICH.1 

0 9166 

0 9034 

12 

KODAKJ 

0 9141 

0.8946 

13 

NESTOR 

0 9139 

0 8991 

14 

IBM 

0 9087 

0 8949 

IS 

ATTJ 

0 9086 

0 8929 

16 

SYMBUS 

0 9082 

0 8897 

17 

HUGHES. 1 

0 9066 

0 8916 

18 

HUGHES. 2 

0 9044 

0 8896 

19 

GTESS.1 

0 9006 

0 8824 

20 

GTESSJ 

0 8999 

0 8816 

21 

OCRSYS 

0 8943 

0 8827 

22 

MIME 

0 8813 

0 8643 

23 

NIST.4 

0 8807 

0 8619 

24 

ASOL 

0 8769 

0 8670 

2S 

REI 

0 8696 

0 8466 

26 

RISC 

0 8637 

0 8321 

27 

GMD.l 

0 8493 

0.8306 

28 

NIST.1 

0 8463 

0 8284 

29 

GMD.3 

0 8467 

0 8278 

30 

KAMAN.l 

0 8386 

0 8204 

31 

GMD.4 

0 8294 

0 8119 

32 

NISTJ 

0 8168 

0.8016 

33 

COMCOM 

0 8029 

0 7973 

34 

K AMAN.3 

0.7961 

0.7767 

36 

IFAX 

0 7886 

0 7731 

36 

KAMAN.2 

0 7880 

0.7684 

37 

NISTJ2 

0.7631 

0 7462 

38 

GMD.2 

0 7604 

0 7322 

39 

VALEN.l 

0.7489 

0.7308 

40 

KAMAN.4 

0.7286 

0-7078 

41 

KAMAN J 

0 6616 

0.6384 

42 

UMICHJ 

0.0423 

0.0210 

Table  24:  ATT_3  correlation  graph  key  for  uppers. 
Figure 35: ATT_3 - lower case correlation (x-axis: SYSTEM NUMBER)


System  Number 

System  Name 

Correlation  (all) 

Correlation  (correct) 

1 

aTT  J 

1 0000 

1 0000 

2 

VOTE^ 

0 8740 

0 8197 

3 

ERIM.l 

0 8464 

0.7899 

4 

AEG 

0 8422 

0 7886 

S 

REFERENCE 

0.8366 

0 8366 

6 

UMICH.l 

0 8273 

0-7747 

7 

UBOL 

0 8248 

0 7704 

S 

ATT_2 

0 8236 

0 7780 

9 

ATTJ 

0 8202 

0-7749 

10 

KODAK J 

0 8199 

0,7736 

1 1 

NYNEX 

0 8197 

0 7768 

12 

OCRSYS 

0 8194 

0 7744 

13 

ATT.4 

0 8192 

0 7746 

14 

IBM 

0 8183 

0 7699 

16 

NESTOR 

0 8141 

0 7681 

16 

VOTE_P 

0 8133 

0 7827 

17 

HUGHES. 1 

0.8108 

0.7648 

18 

HUGHES. 2 

0 8070 

0 7622 

19 

GTESS.l 

0 7972 

0 7496 

20 

NIST-4 

0 7960 

0.7376 

21 

GTESS  J 

0 7918 

0,7431 

22 

RISO 

0 7869 

0 7289 

23 

NIST.l 

0,7867 

0 7390 

24 

GMD.3 

0 7738 

0,7260 

26 

ASOL 

0 7628 

0.7193 

26 

GMD.4 

0.7674 

0 7100 

27 

GMD-1 

0,7674 

0 7100 

28 

NIST-3 

0 7382 

0 7129 

29 

GMD-2 

0 7100 

0 6637 

30 

VALEN-1 

0 6861 

0 6359 

31 

KAMAN-1 

0 6822 

0 6365 

32 

NIST-2 

0 6766 

0 6342 

33 

KAMAN.3 

0 6697 

0 6146 

34 

KAMAN-2 

0 6442 

0 6013 

36 

K AMAN-4 

0 6709 

0 6332 

36 

KAMAN.4 

0 6381 

0 5007 

37 

COMCOM 

0 4940 

0 4818 

38 

UMICH-2 

0,0862 

0 0456 

Table  25:  ATT_3  correlation  graph  key  for  lowers. 




SYSTEM: ATT_4

PARTICIPANT: Dr. Craig R. Nohl
ORGANIZATION: AT&T Bell Laboratories, Holmdel, NJ
FEATURES: raw?
CLASSIFICATION: vote of three ? layer NNs with local receptive
fields and replicated weights

HARDWARE: SPARC2

TRAINING:  DIGITS   UPPERS   LOWERS   DATABASE
           210000   40000    40000    NSDB3
           10000    0        0        USPS

STATUS: on time

RESULTS:   -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
           REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
           RATE    RATE      RATE    RATE      RATE    RATE
           0.00    0.0410    0.00    0.0500    0.00    0.1428
           0.10    0.0098    0.10    0.0138    0.10    0.0968
           0.20    0.0034    0.20    0.0059    0.20    0.0596
           0.30    0.0014    0.30    0.0037    0.30    0.0334
           0.40    0.0008    0.40    0.0015    0.40    0.0193
           0.50    0.0003    0.50    0.0008    0.50    0.0113

OCR RATE (CPS):  DIGITS   UPPERS   LOWERS
SYS RATE:        1.15     1.03     1.50
CPU RATE:
SYSTEM: ATT_4
BIBLIOGRAPHY:

The  following  references  have  been  provided  for  this  system: 
[5][6][7][8] 
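The ATT_4 data sheet describes its classification as a vote of three networks but does not say how the votes are combined. A minimal sketch of one common scheme is shown below, assuming a strict majority rule in which a character is rejected when no label wins more than half of the votes. The function name and the rejection rule are assumptions, not details taken from the report.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by a strict majority of the classifiers,
    or None (reject) when no label has more than half of the votes."""
    if not labels:
        raise ValueError("need at least one vote")
    top, count = Counter(labels).most_common(1)[0]
    return top if count > len(labels) / 2 else None


# Three hypothetical network outputs for three different characters.
agreed = majority_vote(["7", "7", "1"])      # two of three agree
unanimous = majority_vote(["3", "3", "3"])   # all agree
split = majority_vote(["7", "1", "4"])       # no majority: reject
```

A rule of this kind yields a rejection mechanism at no extra cost, since disagreement among the voters marks the doubtful characters.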
Figure 36: Error rate versus rejection rate for ATT_4 (curves: DIGITS, UPPERS, LOWERS; x-axis: REJECTION RATE (%))

Figure 37: Error rate per writer of ATT_4 (axis label: NUMBER WRITERS WITH ERROR)
Figure 38: ATT_4 - digit correlation (x-axis: SYSTEM NUMBER)


System  Number 

System  Name 

Correlation  ( ali ) 

Correlation  (correct) 

1 

ATT.4 

1 0000 

1 0000 

2 

VOTE-M 

0 9703 

0.9341 

3 

AEG 

0 9603 

0 9446 

4 

ATT  J 

0 9393 

0.9436 

i 

VOTE_P 

0 9391 

0.9473 

6 

KODAK.2 

0 9391 

0.9410 

7 

REFERENCE 

0 9390 

0,9390 

9 

OCRSYS 

0 9383 

0 9320 

9 

ERlM.l 

0 9374 

0 9413 

10 

ATT_1 

0 9373 

0.9443 

11 

ELSAGBJJ 

0 9338 

0.9432 

12 

ELSAGB J 

0 9332 

0 9428 

13 

ERIMJ 

0 9331 

0.9391 

14 

KODAKJ 

0 9326 

0.9331 

13 

IBM 

0 9314 

0 9404 

16 

UBOL 

0 9488 

0 9348 

17 

SYMBUS 

0 9486 

0 9333 

18 

THINK.l 

0 9483 

0 9322 

19 

NIST.4 

0 94  79 

0 9315 

20 

ELSAGB.l 

0 9474 

0.931 1 

21 

ATT  J 

0.94  73 

0 9329 

22 

NESTOR 

0 9466 

0 9334 

23 

THINKS 

0 9443 

0 9331 

24 

HUGHES.! 

0 9439 

0 9303 

23 

HUGHES.2 

0 9429 

0,9299 

26 

NYNEX 

0 9422 

0 9321 

27 

REI 

0 9417 

0 9333 

28 

GTESS  J 

0 9336 

0 9175 

29 

GTESS.l 

0 9333 

0 9181 

30 

COMCOM 

0 9247 

0 9219 

31 

NlST.l 

0,9247 

0 9072 

32 

MIME 

0 9193 

0 9007 

33 

NISTj; 

0 9172 

0 8973 

34 

GMDJJ 

0 9170 

0 9016 

33 

ASOL 

0 9133 

0.8974 

36 

UPENN 

0 9120 

0 8942 

37 

NISTJJ 

0 9114 

0 8913 

38 

GMD.l 

0 9090 

0.8949 

39 

RISC 

0 9060 

0 8838 

40 

GMD.4 

0 8933 

0 8801 

41 

KAMAN-1 

0 8934 

0.8739 

42 

RAM  AN-3 

0.8746 

0 8366 

43 

KAMAN.2 

0 8721 

0 8342 

44 

GMD.2 

0 8333 

0 8333 

43 

KAMAN-3 

0,8303 

0.8332 

46 

VALEN.2 

0 8370 

0 8237 

47 

IFAX 

0 8297 

0 8132 

48 

VALEN.I 

0 8191 

0 8032 

49 

K AM  AN. 4 

0 8000 

0 7826 

Table  26:  ATT_4  correlation  graph  key  for  digits. 
Figure 39: ATT_4 - upper case correlation (x-axis: SYSTEM NUMBER)


Syiiem  Number 

Sy«iem  N&me 

Correi&iion  ( all ) 

Correlation  (correct) 

1 

AT'f.4 

1 0000 

1 0000 

2 

VOTE-M 

0.9663 

0 9461 

3 

REFERENCE 

0 9600 

0 9600 

4 

AEG 

0 9476 

0 9330 

5 

ATT-2 

0 9422 

0 9223 

6 

VOTE_P 

0 9386 

0 9268 

7 

ERIM.l 

0 9379 

0.9219 

9 

KODAKJ 

0 9374 

0 9140 

9 

NYNEX 

0 9371 

0 9227 

10 

UMICH.l 

0 9361 

0 9206 

L 1 

UBOL 

0 9313 

0 9132 

12 

NESTOR 

0 9308 

0 9166 

13 

SYMBUS 

0 9306 

0 9081 

14 

ATTJ 

0 9300 

0 9111 

IS 

IBM 

0 9298 

0 9127 

16 

ATTJ 

0 9268 

0.9094 

17 

HUGHES. 1 

0 9246 

0,9088 

18 

HUGHES-2 

0 9216 

0 9060 

19 

GTESS.l 

0 9168 

0.8981 

20 

GTESS_2 

0 9167 

0 8967 

21 

OCRSYS 

0 9091 

0 8983 

22 

MIME 

0 9077 

0 8838 

23 

ASOL 

0 8967 

0,8734 

24 

NIST.4 

0 8937 

0 8766 

2S 

RISO 

0 8760 

0.8488 

26 

REI 

0 8726 

0 8689 

27 

NIST.l 

0 8664 

0 8460 

26 

GMD.l 

0 8687 

0 8421 

29 

GMD.3 

0 8670 

0 8401 

30 

KAMAN.l 

0 8623 

0 8333 

31 

GMD.4 

0 8386 

0 8230 

32 

NISTJ 

0 8307 

0 8148 

33 

COMCOM 

0 8132 

0,8076 

34 

KAMAN.3 

0 8063 

0 7875 

3S 

IFAX 

0 8004 

0 7846 

36 

KAMAN.2 

0,7977 

0,7789 

37 

NISTJ 

0 7770 

0 7686 

38 

GMD.2 

0 7671 

0 7462 

39 

VALEN.l 

0 7696 

0.7416 

40 

KAMAN.4 

0.7361 

0 7162 

41 

KAMAN J 

0 6616 

0 6469 

42 

UMICH-2 

0 0384 

0.0218 

Table  27:  ATT_4  correlation  graph  key  for  uppers. 
Figure 40: ATT_4 - lower case correlation (x-axis: SYSTEM NUMBER)


System   System       Correlation   Correlation
Number   Name         (all)         (correct)

1        ATT_4        1.0000        1.0000
2        VOTE_M       0.8873        0.8351
3        ATT_2        0.8580        0.8033
4        REFERENCE    0.8572        0.8572
5        KODAK_1      0.8529        0.7972
6        NYNEX        0.8395        0.7928
7        ERIM_1       0.8374        0.7934
8        AEG          0.8362        0.7946
9        IBM          0.8303        0.7828
10       NESTOR       0.8301        0.7837
11       ATT_1        0.8275        0.7864
12       VOTE_P       0.8223        0.7939
13       OCRSYS       0.8219        0.7849
14       UMICH_1      0.8210        0.7791
15       ATT_3        0.8192        0.7745
16       GTESS_1      0.8176        0.7678
17       UBOL         0.8173        0.7740
18       HUGHES_1     0.8153        0.7734
19       HUGHES_2     0.8130        0.7718
20       GTESS_2      0.8112        0.7605
21       NIST_1       0.8030        0.7559
22       ASOL         0.7969        0.7408
23       RISO         0.7947        0.7391
24       GMD_3        0.7878        0.7390
25       NIST_4       0.7847        0.7387
26       GMD_4        0.7718        0.7232
27       GMD_1        0.7718        0.7232
28       NIST_3       0.7650        0.7349
29       GMD_2        0.7287        0.6779
30       KAMAN_1      0.7004        0.6510
31       NIST_2       0.6933        0.6484
32       VALEN_1      0.6851        0.6419
33       KAMAN_3      0.6750        0.6283
34       KAMAN_2      0.6608        0.6148
35       KAMAN_5      0.5823        0.5420
36       KAMAN_4      0.5477        0.5107
37       COMCOM       0.5023        0.4896
38       UMICH_2      0.1065        0.0583

Table  28:  ATT_4  correlation  graph  key  for  lowers. 
SYSTEM: COMCOM

PARTICIPANTS: Mr. Eberhard Kuehl, Perry Riggs
ORGANIZATION: Com Com Systems, Inc., Clearwater, FL
PREPROCESSING: thinning.
FEATURES: ?
CLASSIFICATION: proprietary: not NN, not pixel comparison, nor
vector analysis. Positional information and features
matched against database of "tables".

HARDWARE: 386

TRAINING:  DIGITS   UPPERS   LOWERS   DATABASE
           Number used is proprietary  NSDB3
           Number used is proprietary  INTERNAL

STATUS: on time, O's and I's interchanged in RJX files

RESULTS:   -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
           REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
           RATE    RATE      RATE    RATE      RATE    RATE
           0.00    0.0456    0.00    0.1694    0.00    0.4800
           0.03    0.0186    0.15    0.0242    0.45    0.0590

OCR RATE (CPS):  DIGITS   UPPERS   LOWERS
SYS RATE:        12.68    11.71    9.09
CPU RATE:

NOTE:  Internal  database  contains  110000  hand  printed  digits,  220000  upper  case  letters,  and  at 
least  60000  mixed  uppers  and  lowers. 
SYSTEM: COMCOM

The  following  references  have  been  provided  for  this  system: 
Figure 41: Error rate versus rejection rate for COMCOM (curves: DIGITS, UPPERS, LOWERS; x-axis: REJECTION RATE (%))

Figure 42: Error rate per writer of COMCOM (axis label: NUMBER WRITERS WITH ERROR E)
Figure 43: COMCOM - digit correlation (x-axis: SYSTEM NUMBER)


System  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

COMCOM 

1-0000 

1.0000 

2 

REFERENCE 

0 9544 

0 9544 

3 

OCRSYS 

0 9471 

0.9445 

4 

VOTE-M 

0 9390 

0-9364 

S 

ATT  J 

0 9323 

0.9298 

6 

VOTE_P 

0 9321 

0.9271 

7 

AEG 

0.9314 

0,9286 

8 

ELSAGB J 

0.9313 

0.9286 

9 

ELSAGB^ 

0.931 1 

0 9284 

10 

IBM 

0 9311 

0 9280 

1 1 

THINK_2 

0 9302 

0 9264 

12 

ATTJ 

0 9285 

0 9258 

13 

REI 

0 9285 

0 9251 

14 

ERIM.l 

0 9279 

0.9249 

15 

ERIM  J 

0 9278 

0 9250 

16 

NYNEX 

0 9255 

0.9219 

17 

ATT.4 

0 9247 

0.9219 

18 

KODAK_2 

0.9244 

0 9218 

19 

UBOL 

0 9224 

0 9198 

20 

NESTOR 

0 9215 

0.9186 

21 

HUGHES. 1 

0 9206 

0.9170 

22 

HUGHES. 2 

0 9203 

0 9168 

23 

SYMBUS 

0,9197 

0 9169 

24 

KODAKU 

0 9186 

0 9158 

25 

ATTJ 

0 9181 

0 9155 

26 

THINK.l 

0,9171 

0 9145 

27 

NIST.4 

0.9170 

0 9141 

28 

ELSAGB.l 

0 9163 

0 9133 

29 

GTESS.l 

0 9013 

0 8986 

30 

GTESS J 

0 9000 

0 8972 

31 

NIST.l 

0 8903 

0 8875 

32 

GMD.3 

0 8875 

0 8847 

33 

MIME 

0.8831 

0 8800 

34 

UPENN 

0 8819 

0 8782 

35 

GMD.l 

0 8818 

0 8789 

36 

ASOL 

0 8802 

0 8772 

37 

NIST.2 

0 8762 

0 8739 

38 

NIST  J 

0 8713 

0 8689 

39 

GMD.4 

0 8686 

0 8657 

40 

RISO 

0 8633 

0 8607 

41 

KAMAN.l 

0 8557 

0 8527 

42 

K AMAN_3 

0 8410 

0 8378 

43 

KAMAN.2 

0 8386 

0 8353 

44 

K AMAN.S 

0 8223 

0 8189 

45 

valen j 

0.8196 

0 8159 

46 

GMD.2 

0 8183 

0,8159 

47 

IFAX 

0.8072 

0 8032 

48 

valen-1 

0,7946 

0 7914 

49 

KAMAN.4 

0,7676 

0-7646 

Table  29:  COMCOM  correlation  graph  key  for  digits. 
Figure 44: COMCOM - upper case correlation (x-axis: SYSTEM NUMBER)


System   System       Correlation   Correlation
Number   Name         (all)         (correct)

1        COMCOM       1.0000        1.0000
2        VOTE_P       0.8324        0.8052
3        REFERENCE    0.8306        0.8306
4        VOTE_M       0.8261        0.8205
5        AEG          0.8222        0.8168
6        NYNEX        0.8180        0.8113
7        ATT_4        0.8132        0.8075
8        ERIM_1       0.8132        0.8071
9        UMICH_1      0.8132        0.8066
10       ATT_2        0.8084        0.8024
11       NESTOR       0.8065        0.8014
12       IBM          0.8065        0.8004
13       UBOL         0.8060        0.7999
14       HUGHES_1     0.8053        0.7986
15       HUGHES_2     0.8040        0.7971
16       ATT_1        0.8031        0.7968
17       ATT_3        0.8029        0.7973
18       KODAK_1      0.8007        0.7949
19       SYMBUS       0.7995        0.7941
20       OCRSYS       0.7987        0.7927
21       GTESS_1      0.7931        0.7875
22       GTESS_2      0.7930        0.7870
23       MIME         0.7764        0.7709
24       NIST_4       0.7728        0.7669
25       ASOL         0.7684        0.7623
26       REI          0.7668        0.7612
27       NIST_1       0.7474        0.7421
28       RISO         0.7469        0.7409
29       GMD_1        0.7462        0.7402
30       GMD_3        0.7452        0.7391
31       KAMAN_1      0.7391        0.7327
32       GMD_4        0.7324        0.7261
33       NIST_3       0.7133        0.7119
34       IFAX         0.7068        0.6995
35       KAMAN_3      0.6997        0.6937
36       KAMAN_2      0.6890        0.6833
37       NIST_2       0.6697        0.6660
38       GMD_2        0.6618        0.6569
39       VALEN_1      0.6598        0.6540
40       KAMAN_4      0.6335        0.6282
41       KAMAN_5      0.5768        0.5718
42       UMICH_2      0.0274        0.0175

Table  30:  COMCOM  correlation  graph  key  for  uppers. 
Figure 45: COMCOM - lower case correlation (x-axis: SYSTEM NUMBER)


System   System       Correlation   Correlation
Number   Name         (all)         (correct)

1        COMCOM       1.0000        1.0000
2        VOTE_P       0.5966        0.4902
3        VOTE_M       0.5209        0.5082
4        REFERENCE    0.5200        0.5200
5        NYNEX        0.5074        0.4937
6        ERIM_1       0.5061        0.4923
7        ATT_4        0.5023        0.4896
8        ATT_2        0.5013        0.4896
9        AEG          0.5011        0.4882
10       KODAK_1      0.5007        0.4876
11       HUGHES_1     0.5004        0.4886
12       HUGHES_2     0.4993        0.4872
13       OCRSYS       0.4980        0.4882
14       IBM          0.4965        0.4856
15       ATT_1        0.4951        0.4848
16       ATT_3        0.4940        0.4818
17       NESTOR       0.4933        0.4822
18       UMICH_1      0.4923        0.4823
19       UBOL         0.4919        0.4798
20       GTESS_1      0.4873        0.4757
21       GTESS_2      0.4830        0.4712
22       NIST_1       0.4782        0.4678
23       GMD_3        0.4712        0.4594
24       NIST_4       0.4688        0.4564
25       ASOL         0.4651        0.4545
26       RISO         0.4627        0.4540
27       GMD_4        0.4622        0.4513
28       GMD_1        0.4622        0.4513
29       NIST_3       0.4565        0.4523
30       GMD_2        0.4342        0.4247
31       KAMAN_1      0.4218        0.4098
32       VALEN_1      0.4204        0.4100
33       NIST_2       0.4168        0.4075
34       KAMAN_3      0.4097        0.3981
35       KAMAN_2      0.3975        0.3869
36       KAMAN_5      0.3582        0.3483
37       KAMAN_4      0.3302        0.3223
38       UMICH_2      0.0438        0.0306

Table  31:  COMCOM  correlation  graph  key  for  lowers. 
SYSTEM: ELSAGB_1


PARTICIPANT: Mr. Francesco Fignoni

ORGANIZATION: ELSAG BAILEY, INC., Conshohocken, PA

PREPROCESSING: noise removal and size normalization to 24x36.

FEATURES: shape function of the character bit maps having the same
          size as the character.

CLASSIFICATION: KNN with respect to shape function distance from references
                representing clusters of shape functions in training sample

HARDWARE: 33 MHz 386 running tight assembly code


TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             85491     NA        NA        NSDB3

STATUS: on time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0507
             0.08    0.0179
             0.12    0.0114

OCR RATE (CPS):   DIGITS    UPPERS    LOWERS
SYS RATE:         65.00     NA        NA
CPU RATE:

NOTE:  This  is  the  system  used  in  their  postal  OCR.  Few  details  of  the  recognition  algorithm  were 
provided. 
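
The CLASSIFICATION entry for ELSAGB_1 describes k-nearest-neighbour matching against references that represent clusters of training samples. Because few details were provided, the following Python sketch only illustrates that general idea. The Euclidean distance, the value of k, and the toy two-dimensional vectors are assumptions that stand in for the undisclosed shape function and its distance measure.

```python
from collections import Counter
import math


def knn_classify(features, references, k=3):
    """Return the majority label among the k references closest to `features`.

    `references` is a list of (vector, label) pairs, for example cluster
    centres computed from a training sample. Euclidean distance is an
    assumption; the report does not define the shape function distance.
    """
    ranked = sorted(references, key=lambda ref: math.dist(features, ref[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D references standing in for real shape-function vectors.
REFERENCES = [
    ((0.0, 0.0), "0"), ((0.1, 0.2), "0"), ((0.2, 0.1), "0"),
    ((1.0, 1.0), "1"), ((0.9, 1.1), "1"), ((1.1, 0.9), "1"),
]

print(knn_classify((0.15, 0.10), REFERENCES))
print(knn_classify((0.95, 1.00), REFERENCES))
```

Matching against a small set of cluster references rather than every training sample keeps the number of distance computations low, which is one plausible reason such a design would suit the fast hardware figure quoted for this system.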
SYSTEM: ELSAGB_1
BIBLIOGRAPHY:

The following references have been provided for this system:
none

COMMENTS: ELSAG BAILEY
SPECIFIC ABOUT ELSAG BAILEY

- After receiving the TESTDATA1 CD-ROM (the test set), Elsag Bailey neither modified nor tuned
in any way the recognition units and associated databases produced from training for the tests
ELSAGB_1, ELSAGB_2, and ELSAGB_3.

Elsag Bailey is aware that, given the poor relationship between the training and test sets,
such countermeasures could have proved useful.

- ELSAGB_1 had some trouble dealing with the thickness range of characters: about 4% of the training
digits have an average thickness of less than 2 or more than 9 pixels.

GENERAL ABOUT THE CONFERENCE

- Elsag Bailey appreciated the way the test and Conference were set up by NIST. It was something
between an acceptance test and a scientific conference, and it proved both useful and interesting.

- The test set for digits was both very difficult and very "far" from the training set; this
produced rather conservative recognition results.

One reason is that the training set did not contain examples of the difficult test characters.
If it had, performance would have been higher. The other reason is that the test characters are
poor in quality, probably representing the low end of a real environment.

While these points do not weaken the relative comparisons among the participants, they nevertheless
compromise the absolute meaning of the recognition performance.

- A good estimate of segmentation performance, the next important part of the whole OCR
process, remains an open question.

In fact, the scoring procedure should be automatic and independent of the recognition unit.
Otherwise, the two procedures are mixed together.
[Plot: ELSAGB_1 digits, error rate against rejection rate (%)]

Figure 46: Error rate versus rejection rate for ELSAGB_1

[Plot: ELSAGB_1, number of writers with error]

Figure 47: Error rate per writer of ELSAGB_1
[Plot: correlation against system number]

Figure 48: ELSAGB_1 - digit correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               ELSAGB_1      1.0000              1.0000
2               VOTE_M        0.9606              0.9444
3               ELSAGB_3      0.9578              0.9393
4               ELSAGB_2      0.9570              0.9388
5               AEG           0.9555              0.9378
6               VOTE_P        0.9506              0.9390
7               ERIM_1        0.9495              0.9331
8               REFERENCE     0.9493              0.9493
9               OCRSYS        0.9485              0.9425
10              KODAK_?       0.9480              0.9313
11              ATT_4         0.9474              0.9311
12              ATT_2         0.9463              0.9328
13              ATT_?         0.9462              0.9345
14              UBOL          0.9455              0.9287
15              ERIM_2        0.9454              0.9312
16              IBM           0.9447              0.9325
17              NIST_4        0.9439              0.9249
18              KODAK_?       0.9431              0.9260
19              NESTOR        0.9392              0.9254
20              HUGHES_1      0.9392              0.9243
21              ATT_?         0.9391              0.9247
22              HUGHES_2      0.9389              0.9241
23              THINK_1       0.9386              0.9228
24              THINK_2       0.9384              0.9277
25              SYMBUS        0.9379              0.9240
26              REI           0.9334              0.9249
27              NYNEX         0.9329              0.9228
28              GTESS_1       0.9270              0.9097
29              GTESS_2       0.9263              0.9087
30              COMCOM        0.9163              0.9133
31              NIST_1        0.9155              0.8981
32              GMD_3         0.9133              0.8958
33              NIST_2        0.9065              0.8881
34              MIME          0.9059              0.8902
35              ASOL          0.9057              0.8890
36              GMD_1         0.9056              0.8894
37              UPENN         0.9033              0.8861
38              NIST_3        0.9010              0.8830
39              RISO          0.8912              0.8730
40              GMD_4         0.8910              0.8751
41              KAMAN_1       0.8865              0.8666
42              KAMAN_3       0.8720              0.8513
43              KAMAN_2       0.8695              0.8490
44              KAMAN_5       0.8491              0.8303
45              GMD_2         0.8470              0.8291
46              VALEN_2       0.8327              0.8203
47              IFAX          0.8247              0.8092
48              VALEN_1       0.8178              0.8003
49              KAMAN_4       0.7964              0.7775

Table 32: ELSAGB_1 correlation graph key for digits.
No Data Available


Figure 49: ELSAGB_1 - upper case correlation


There was no data for this evaluation.

Table 33: ELSAGB_1 correlation graph key for uppers.

No Data Available


Figure 50: ELSAGB_1 - lower case correlation


There was no data for this evaluation.

Table 34: ELSAGB_1 correlation graph key for lowers.
SYSTEM: ELSAGB_2


PARTICIPANT: Mr. Francesco Fignoni

ORGANIZATION: ELSAG BAILEY, INC., Conshohocken, PA

PREPROCESSING: noise removal and size normalization to 24x36.

FEATURES: shape function of the character bit maps having the same
          size as the character.

CLASSIFICATION: KNN with respect to shape function distance from references
                representing clusters of shape functions in training sample
                (the classifier used with ELSAGB_1) for preclassification
                followed by the same classifier using a more sophisticated
                distance measure and many more references

HARDWARE: 33 MHz 386 running tight assembly for preclassification
          VAX 6000/410 under VMS running FORTRAN for classification


TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             85491     NA        NA        NSDB3

STATUS: on time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0338
             0.05    0.0135
             0.08    0.0097
             0.10    0.0077
             0.11    0.0068

OCR RATE (CPS):   DIGITS    UPPERS    LOWERS
SYS RATE:         0.30      NA        NA
CPU RATE:

NOTE:  This  is  a laboratory  research  system.  Few  details  of  the  recognition  algorithm  were  provided. 
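
The CLASSIFICATION entry for ELSAGB_2 describes a fast preclassification followed by the same kind of classifier with a more sophisticated distance and many more references. The Python sketch below shows one way such a two-stage arrangement could work. It is an illustration under assumptions: the report does not say how the first stage constrains the second, so shortlisting candidate classes, the two distance functions, and all reference vectors are hypothetical.

```python
import math


def nearest_labels(features, references, distance, n):
    """Labels of the n references closest to `features` under `distance`."""
    ranked = sorted(references, key=lambda ref: distance(features, ref[0]))
    return [label for _, label in ranked[:n]]


def coarse_distance(a, b):
    # Cheap city-block distance for the preclassification stage (assumed).
    return sum(abs(x - y) for x, y in zip(a, b))


def fine_distance(a, b):
    # Stand-in for the "more sophisticated" measure; plain Euclidean here.
    return math.dist(a, b)


def two_stage_classify(features, coarse_refs, fine_refs, shortlist=2):
    """Shortlist classes with a few coarse references, then decide among
    the shortlisted classes using the larger fine reference set."""
    candidates = set(nearest_labels(features, coarse_refs, coarse_distance, shortlist))
    restricted = [ref for ref in fine_refs if ref[1] in candidates]
    return nearest_labels(features, restricted, fine_distance, 1)[0]


# Hypothetical references: three coarse prototypes and a larger fine set.
COARSE = [((0.0, 0.0), "0"), ((1.0, 1.0), "1"), ((2.0, 0.0), "7")]
FINE = COARSE + [((0.4, 0.6), "1"), ((0.2, 0.1), "0"), ((1.8, 0.3), "7")]

print(two_stage_classify((0.45, 0.50), COARSE, FINE))
print(two_stage_classify((1.90, 0.10), COARSE, FINE))
```

Restricting the expensive comparison to a shortlist is a common way to trade a small risk of first-stage misses for a large reduction in second-stage work.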
SYSTEM: ELSAGB_2

BIBLIOGRAPHY:

The following references have been provided for this system:
none

[Plot: ELSAGB_2 digits, error rate (%) against rejection rate (%)]

Figure 51: Error rate versus rejection rate for ELSAGB_2

Figure 52: Error rate per writer of ELSAGB_2
[Plot: correlation against system number]

Figure 53: ELSAGB_2 - digit correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               ELSAGB_2      1.0000              1.0000
2               ELSAGB_3      0.9983              0.9655
3               VOTE_M        0.9721              0.9583
4               REFERENCE     0.9662              0.9662
5               AEG           0.9642              0.9499
6               OCRSYS        0.9632              0.9579
7               VOTE_P        0.9606              0.9506
8               ATT_1         0.9591              0.9488
9               ERIM_1        0.9572              0.9447
10              ELSAGB_1      0.9570              0.9388
11              UBOL          0.9561              0.9416
12              ATT_?         0.9558              0.9450
13              IBM           0.9553              0.9457
14              ATT_4         0.9552              0.9428
15              KODAK_2       0.9545              0.9423
16              ERIM_2        0.9535              0.9428
17              NIST_4        0.9517              0.9365
18              THINK_2       0.9499              0.9410
19              KODAK_1       0.9492              0.9368
20              THINK_1       0.9482              0.9352
21              NESTOR        0.9471              0.9368
22              ATT_?         0.9468              0.9358
23              SYMBUS        0.9462              0.9352
24              HUGHES_2      0.9454              0.9345
25              HUGHES_1      0.9453              0.9347
26              REI           0.9451              0.9383
27              NYNEX         0.9440              0.9360
28              GTESS_1       0.9328              0.9200
29              GTESS_2       0.9319              0.9186
30              COMCOM        0.9311              0.9284
31              NIST_1        0.9230              0.9090
32              GMD_3         0.9194              0.9055
33              MIME          0.9128              0.9006
34              GMD_1         0.9118              0.8990
35              ASOL          0.9112              0.8983
36              NIST_2        0.9083              0.8958
37              UPENN         0.9080              0.8953
38              NIST_3        0.9040              0.8909
39              GMD_4         0.8970              0.8846
40              RISO          0.8955              0.8820
41              KAMAN_1       0.8876              0.8739
42              KAMAN_3       0.8720              0.8579
43              KAMAN_2       0.8696              0.8554
44              KAMAN_5       0.8499              0.8369
45              GMD_2         0.8479              0.8354
46              VALEN_2       0.8384              0.8293
47              IFAX          0.8281              0.8172
48              VALEN_1       0.8201              0.8079
49              KAMAN_4       0.7965              0.7831

Table 35: ELSAGB_2 correlation graph key for digits.
No Data Available


Figure 54: ELSAGB_2 - upper case correlation


There was no data for this evaluation.

Table 36: ELSAGB_2 correlation graph key for uppers.

No Data Available


Figure 55: ELSAGB_2 - lower case correlation

There was no data for this evaluation.

Table 37: ELSAGB_2 correlation graph key for lowers.
SYSTEM: ELSAGB_3


PARTICIPANT: Mr. Francesco Fignoni

ORGANIZATION: ELSAG BAILEY, INC., Conshohocken, PA

PREPROCESSING: noise removal and size normalization to 24x36.

FEATURES: shape function of the character bit maps having the same
          size as the character.

CLASSIFICATION: KNN with respect to shape function distance from references
                representing clusters of shape functions in training sample
                (the classifier used with ELSAGB_1) for preclassification
                followed by the same classifier using a more sophisticated
                distance measure and many more references

HARDWARE: 33 MHz 386 running tight assembly for preclassification
          VAX 6000/410 under VMS running FORTRAN for classification


TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             85491     NA        NA        NSDB3

STATUS: on time, lost at NIST until after Conference

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0335
             0.04    0.0180
             0.07    0.0102

OCR RATE (CPS):   DIGITS    UPPERS    LOWERS
SYS RATE:         0.30      NA        NA
CPU RATE:

NOTE:  This  is  a laboratory  research  system.  Few  details  of  the  recognition  algorithm  were  provided. 


SYSTEM: ELSAGB_3
BIBLIOGRAPHY:

The following references have been provided for this system:
none

[Plot: ELSAGB_3 digits, error rate against rejection rate (%)]

Figure 56: Error rate versus rejection rate for ELSAGB_3

[Plot: ELSAGB_3, number of writers with error]

Figure 57: Error rate per writer of ELSAGB_3
[Plot: correlation against system number]

Figure 58: ELSAGB_3 - digit correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               ELSAGB_3      1.0000              1.0000
2               ELSAGB_2      0.9983              0.9655
3               VOTE_M        0.9726              0.9587
4               REFERENCE     0.9665              0.9665
5               AEG           0.9645              0.9502
6               OCRSYS        0.9636              0.9583
7               VOTE_P        0.9611              0.9510
8               ATT_?         0.9594              0.9491
9               ERIM_1        0.9578              0.9451
10              ELSAGB_1      0.9578              0.9393
11              UBOL          0.9566              0.9420
12              ATT_?         0.9562              0.9454
13              ATT_4         0.9558              0.9432
14              IBM           0.9556              0.9460
15              KODAK_?       0.9550              0.9427
16              ERIM_2        0.9543              0.9433
17              NIST_4        0.9522              0.9369
18              THINK_2       0.9502              0.9413
19              KODAK_?       0.9497              0.9371
20              THINK_1       0.9486              0.9356
21              NESTOR        0.9477              0.9373
22              ATT_?         0.9472              0.9362
23              SYMBUS        0.9467              0.9356
24              HUGHES_2      0.9459              0.9349
25              HUGHES_1      0.9457              0.9350
26              REI           0.9455              0.9387
27              NYNEX         0.9443              0.9363
28              GTESS_1       0.9330              0.9203
29              GTESS_2       0.9322              0.9189
30              COMCOM        0.9313              0.9286
31              NIST_1        0.9231              0.9092
32              GMD_3         0.9198              0.9059
33              MIME          0.9132              0.9010
34              GMD_1         0.9120              0.8993
35              ASOL          0.9114              0.8985
36              NIST_?        0.9087              0.8962
37              UPENN         0.9086              0.8958
38              NIST_?        0.9044              0.8913
39              GMD_4         0.8972              0.8849
40              RISO          0.8958              0.8823
41              KAMAN_1       0.8879              0.8742
42              KAMAN_3       0.8723              0.8582
43              KAMAN_2       0.8699              0.8557
44              KAMAN_5       0.8501              0.8371
45              GMD_2         0.8482              0.8356
46              VALEN_2       0.8387              0.8296
47              IFAX          0.8287              0.8177
48              VALEN_1       0.8204              0.8082
49              KAMAN_4       0.7968              0.7834

Table 38: ELSAGB_3 correlation graph key for digits.
No Data Available


Figure 59: ELSAGB_3 - upper case correlation


There was no data for this evaluation.


Table 39: ELSAGB_3 correlation graph key for uppers.

No Data Available


Figure 60: ELSAGB_3 - lower case correlation


There was no data for this evaluation.


Table 40: ELSAGB_3 correlation graph key for lowers.
SYSTEM: ERIM_1


PARTICIPANT: Steven Schlosser


ORGANIZATION: Environmental Research Institute of Michigan (ERIM)
              Ann Arbor, Michigan


PREPROCESSING: filtering and size normalization

FEATURES: stroke detection, morphological
          feature extraction.

CLASSIFICATION: four layer NN with BP. For digits, 245 input units,
                and two hidden layers with 25 and 15 hidden units,
                10 output units. For characters, 120 input units,
                and two hidden layers with 65 and 39 hidden units,
                26 output units.

HARDWARE: SUN-4


TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             61000     40300     36400     NSDB3

STATUS: on time, submitted as ERIM_0

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0388    0.00    0.0518    0.00    0.1379
             0.10    0.0082    0.10    0.0180    0.10    0.0897
             0.20    0.0025    0.20    0.0072    0.20    0.0554
             0.30    0.0012    0.30    0.0041    0.30    0.0368
             0.40    0.0009    0.40    0.0024    0.40    0.0214
             0.50    0.0007    0.50    0.0020    0.50    0.0118

OCR RATE (CPS):   DIGITS    UPPERS    LOWERS
SYS RATE:         0.24      0.24      0.24
CPU RATE:         0.91      0.91      0.91
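
The CLASSIFICATION entry for ERIM_1 gives the layer sizes of its four-layer networks. As a rough illustration of what those sizes imply, the Python sketch below builds networks with the quoted shapes and runs a forward pass. The sigmoid activation, the random initial weights, and the constant test input are assumptions; back-propagation training and the real stroke and morphological features are not shown.

```python
import math
import random

DIGIT_LAYERS = [245, 25, 15, 10]     # layer sizes quoted for digits
LETTER_LAYERS = [120, 65, 39, 26]    # layer sizes quoted for characters


def init_network(sizes, seed=0):
    """Random weights and zero biases for each pair of adjacent layers."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(network, inputs):
    """Propagate `inputs` through every layer using sigmoid units (assumed)."""
    activations = list(inputs)
    for weights, biases in network:
        activations = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(row, activations)) + b)))
            for row, b in zip(weights, biases)
        ]
    return activations


def count_weights(sizes):
    """Number of connection weights between layers, excluding biases."""
    return sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))


net = init_network(DIGIT_LAYERS)
outputs = forward(net, [0.5] * DIGIT_LAYERS[0])
print(len(outputs), count_weights(DIGIT_LAYERS), count_weights(LETTER_LAYERS))
```

Under these layer sizes the digit network has 6650 connection weights and the character network 11349, so both are small by the standards of fully connected pixel-input networks.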
SYSTEM: ERIM_1
BIBLIOGRAPHY:

The following references have been provided for this system:
[11][12][13][14]

[Plot: ERIM_1 digits, uppers, and lowers; error rate against rejection rate (%)]

Figure 61: Error rate versus rejection rate for ERIM_1

[Plot: ERIM_1, number of writers with error]

Figure 62: Error rate per writer of ERIM_1
[Plot: correlation against system number]

Figure 63: ERIM_1 - digit correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               ERIM_1        1.0000              1.0000
2               VOTE_M        0.9726              0.9661
3               AEG           0.9674              0.9492
4               OCRSYS        0.9619              0.9660
5               VOTE_P        0.9613              0.9494
6               REFERENCE     0.9612              0.9612
7               ERIM_2        0.9587              0.9433
8               ATT_?         0.9680              0.9440
9               ELSAGB_3      0.9578              0.9461
10              ATT_4         0.9674              0.9416
11              ELSAGB_2      0.9572              0.9447
12              KODAK_?       0.9666              0.9410
13              ATT_1         0.9663              0.9446
14              IBM           0.9649              0.9432
15              UBOL          0.9636              0.9382
16              KODAK_?       0.9604              0.9349
17              NESTOR        0.9497              0.9361
18              NIST_4        0.9496              0.9334
19              HUGHES_1      0.9496              0.9346
20              ELSAGB_1      0.9496              0.9331
21              SYMBUS        0.9494              0.9360
22              HUGHES_2      0.9491              0.9344
23              ATT_?         0.9486              0.9346
24              THINK_2       0.9477              0.9376
25              THINK_1       0.9463              0.9316
26              REI           0.9436              0.9366
27              NYNEX         0.9421              0.9331
28              GTESS_1       0.9334              0.9182
29              GTESS_2       0.9333              0.9176
30              COMCOM        0.9279              0.9249
31              NIST_1        0.9199              0.9067
32              GMD_3         0.9186              0.9036
33              MIME          0.9160              0.8992
34              ASOL          0.9134              0.8972
35              NIST_2        0.9117              0.8964
36              UPENN         0.9110              0.8962
37              GMD_1         0.9106              0.8968
38              NIST_3        0.9078              0.8910
39              RISO          0.8990              0.8818
40              GMD_4         0.8962              0.8821
41              KAMAN_1       0.8930              0.8746
42              KAMAN_3       0.8764              0.8683
43              KAMAN_2       0.8742              0.8661
44              KAMAN_5       0.8621              0.8367
45              GMD_2         0.8604              0.8347
46              VALEN_2       0.8429              0.8301
47              IFAX          0.8340              0.8186
48              VALEN_1       0.8217              0.8073
49              KAMAN_4       0.8016              0.7844

Table 41: ERIM_1 correlation graph key for digits.
[Plot: correlation against system number]

Figure 64: ERIM_1 - upper case correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               ERIM_1        1.0000              1.0000
2               VOTE_M        0.9639              0.9379
3               REFERENCE     0.9482              0.9482
4               AEG           0.9477              0.9323
5               ATT_4         0.9379              0.9219
6               UMICH_1       0.9323              0.9181
7               NYNEX         0.9306              0.9182
8               ATT_?         0.9290              0.9143
9               UBOL          0.9290              0.9116
10              VOTE_P        0.9278              0.9173
11              NESTOR        0.9266              0.9117
12              HUGHES_1      0.9262              0.9083
13              IBM           0.9246              0.9090
14              KODAK_?       0.9237              0.9066
15              HUGHES_2      0.9237              0.9063
16              ATT_?         0.9223              0.9064
17              ATT_?         0.9216              0.9066
18              SYMBUS        0.9200              0.9029
19              GTESS_1       0.9101              0.8960
20              GTESS_2       0.9101              0.8942
21              OCRSYS        0.9080              0.8972
22              MIME          0.8911              0.8760
23              NIST_4        0.8866              0.8706
24              ASOL          0.8789              0.8639
25              REI           0.8676              0.8662
26              RISO          0.8662              0.8390
27              GMD_1         0.8638              0.8376
28              GMD_3         0.8619              0.8368
29              NIST_1        0.8616              0.8371
30              KAMAN_1       0.8434              0.8279
31              GMD_4         0.8341              0.8192
32              NIST_?        0.8179              0.8072
33              COMCOM        0.8132              0.8071
34              KAMAN_3       0.7983              0.7833
35              IFAX          0.7949              0.7814
36              KAMAN_2       0.7901              0.7740
37              NIST_?        0.7640              0.7519
38              GMD_2         0.7629              0.7377
39              VALEN_1       0.7613              0.7372
40              KAMAN_4       0.7248              0.7097
41              KAMAN_5       0.6666              0.6441
42              UMICH_2       0.0390              0.0219

Table 42: ERIM_1 correlation graph key for uppers.
[Plot: correlation against system number]

Figure 65: ERIM_1 - lower case correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               ERIM_1        1.0000              1.0000
2               VOTE_M        0.8993              0.8441
3               AEG           0.8644              0.8112
4               REFERENCE     0.8621              0.8621
5               UBOL          0.8471              0.7914
6               ATT_?         0.8464              0.7899
7               KODAK_1       0.8446              0.7961
8               IBM           0.8443              0.7932
9               NYNEX         0.8429              0.7985
10              OCRSYS        0.8417              0.7969
11              ATT_?         0.8405              0.7985
12              HUGHES_1      0.8404              0.7893
13              HUGHES_2      0.8399              0.7882
14              UMICH_1       0.8378              0.7885
15              ATT_?         0.8375              0.7938
16              ATT_?         0.8374              0.7934
17              NESTOR        0.8351              0.7889
18              VOTE_P        0.8308              0.7999
19              GTESS_1       0.8167              0.7691
20              GTESS_2       0.8125              0.7641
21              NIST_4        0.8052              0.7508
22              NIST_1        0.8030              0.7584
23              GMD_?         0.7855              0.7403
24              RISO          0.7838              0.7358
25              ASOL          0.7757              0.7345
26              GMD_4         0.7693              0.7247
27              GMD_1         0.7693              0.7247
28              NIST_?        0.7521              0.7307
29              GMD_?         0.7135              0.6723
30              KAMAN_1       0.6892              0.6481
31              NIST_?        0.6877              0.6485
32              VALEN_1       0.6859              0.6423
33              KAMAN_3       0.6668              0.6262
34              KAMAN_2       0.6482              0.6115
35              KAMAN_5       0.5763              0.5423
36              KAMAN_4       0.5356              0.5058
37              COMCOM        0.5061              0.4923
38              UMICH_2       0.0910              0.0539

Table 43: ERIM_1 correlation graph key for lowers.




SYSTEM: ERIM_2


PARTICIPANT: Steven Schlosser

ORGANIZATION: Environmental Research Institute of Michigan (ERIM)
              Ann Arbor, Michigan

PREPROCESSING: filtering, size and slant normalization

FEATURES: morphological (cavities) and stroke features, whole digits for
          template matching

CLASSIFICATION: four layer NN and template matcher combined together for
                single result


HARDWARE: SUN-4

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             none      NA        NA

STATUS: on time, submitted as ERIM_1

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0392
             0.10    0.0099
             0.20    0.0033
             0.30    0.0013
             0.40    0.0007
             0.50    0.0006

OCR RATE (CPS):   DIGITS    UPPERS    LOWERS
SYS RATE:         10.0      NA        NA
CPU RATE:
SYSTEM: ERIM_2
BIBLIOGRAPHY:

The following references have been provided for this system:
[11][12][13][14]

[Plot: ERIM_2 digits, error rate (%) against rejection rate (%)]

Figure 66: Error rate versus rejection rate for ERIM_2

Figure 67: Error rate per writer of ERIM_2
[Plot: correlation against system number]

Figure 68: ERIM_2 - digit correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               ERIM_2        1.0000              1.0000
2               VOTE_M        0.9677              0.9537
3               AEG           0.9616              0.9465
4               REFERENCE     0.9608              0.9608
5               OCRSYS        0.9608              0.9542
6               ERIM_1        0.9587              0.9433
7               VOTE_P        0.9576              0.9472
8               ATT_?         0.9557              0.9428
9               ELSAGB_3      0.9543              0.9433
10              KODAK_2       0.9539              0.9306
11              ELSAGB_2      0.9535              0.9428
12              ATT_?         0.9534              0.9433
13              ATT_4         0.9531              0.9391
14              IBM           0.9523              0.9417
15              UBOL          0.9487              0.9359
16              SYMBUS        0.9479              0.9345
17              NIST_4        0.9479              0.9323
18              KODAK_1       0.9476              0.9335
19              THINK_2       0.9474              0.9375
20              HUGHES_1      0.9473              0.9329
21              HUGHES_2      0.9469              0.9328
22              NESTOR        0.9455              0.9339
23              ELSAGB_1      0.9454              0.9312
24              ATT_?         0.9450              0.9326
25              REI           0.9439              0.9357
26              THINK_1       0.9429              0.9306
27              NYNEX         0.9417              0.9325
28              GTESS_1       0.9283              0.9155
29              GTESS_2       0.9280              0.9147
30              COMCOM        0.9278              0.9250
31              NIST_1        0.9182              0.9049
32              GMD_3         0.9170              0.9026
33              MIME          0.9111              0.8976
34              ASOL          0.9100              0.8954
35              UPENN         0.9094              0.8945
36              GMD_1         0.9093              0.8960
37              NIST_?        0.9080              0.8936
38              NIST_?        0.9031              0.8885
39              RISO          0.8974              0.8812
40              GMD_4         0.8942              0.8815
41              KAMAN_1       0.8896              0.8735
42              KAMAN_3       0.8733              0.8572
43              KAMAN_2       0.8701              0.8545
44              GMD_2         0.8483              0.8341
45              KAMAN_5       0.8482              0.8349
46              VALEN_2       0.8379              0.8275
47              IFAX          0.8312              0.8173
48              VALEN_1       0.8218              0.8077
49              KAMAN_4       0.7976              0.7825

Table 44: ERIM_2 correlation graph key for digits.
No Data Available


Figure 69: ERIM_2 - upper case correlation


There was no data for this evaluation.


Table 45: ERIM_2 correlation graph key for uppers.

No Data Available


Figure 70: ERIM_2 - lower case correlation


There was no data for this evaluation.


Table 46: ERIM_2 correlation graph key for lowers.
SYSTEM: GMD_1


PARTICIPANT: Frank Smieja

ORGANIZATION: Gesellschaft fuer Mathematik und Datenverarbeitung
              (GMD), Sankt Augustin, Germany

PREPROCESSING: size normalization to 16x24

FEATURES: genetically optimized polynomial filter, 384 features extracted.
          Feature optimization by PGA (parallel genetic algorithm).

CLASSIFICATION: statistical, nearest neighbor


HARDWARE: SPARC2

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             15000     22000     22000     NSDB3

STATUS: on time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0873    0.00    0.1404    0.00    0.2254
             0.15    0.0272    0.16    0.0625    0.31    0.0809

OCR RATE (CPS):   DIGITS    UPPERS    LOWERS
SYS RATE:         1.18      0.48      0.40
CPU RATE:
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SYSTEM:  GMD_1 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 

[15][16] 

COMMENTS: GMD_1,3,4
PARTICIPANT: Frank Smieja

ORGANIZATION: Gesellschaft für Mathematik und Datenverarbeitung (GMD), Sankt Augustin,
Germany.

The  algorithm  works  in  several  steps. 

1.  Normalization  of  the  image  to  16x24  pixels. 

2.  From  a training  set,  64  features  are  computed  by  Karhunen-Loeve  transformation. 

3.  Distance  and  variance  of  the  clusters  are  optimized  by  the  genetic  algorithm. 
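
The first two steps, followed by the nearest-neighbor classification named in the system description, can be sketched in Python as below. This is a minimal illustration under assumptions, not the participant's implementation: the toy data is three-dimensional with two retained components, whereas the report describes 16x24 images (384 pixels) reduced to 64 features; the eigenvectors are found by power iteration with deflation, a method chosen here only to keep the sketch dependency-free; and the genetic optimization of step 3 is not shown.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_kl_basis(samples, n_features, iterations=200):
    """Return (mean, basis): the leading eigenvectors of the sample
    covariance matrix, found by power iteration with deflation."""
    dim = len(samples[0])
    mean = [sum(col) / len(samples) for col in zip(*samples)]
    centered = [[x - m for x, m in zip(row, mean)] for row in samples]
    cov = [[sum(r[i] * r[j] for r in centered) / len(samples)
            for j in range(dim)] for i in range(dim)]
    basis = []
    for _ in range(n_features):
        start = [float(i + 1) for i in range(dim)]   # arbitrary non-zero start
        scale = math.sqrt(dot(start, start))
        vec = [x / scale for x in start]
        for _ in range(iterations):
            nxt = [dot(row, vec) for row in cov]
            norm = math.sqrt(dot(nxt, nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        eigenvalue = dot(vec, [dot(row, vec) for row in cov])
        basis.append(vec)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j] for j in range(dim)]
               for i in range(dim)]
    return mean, basis


def project(sample, mean, basis):
    """Karhunen-Loeve features: coordinates of the centered sample in the basis."""
    centered = [x - m for x, m in zip(sample, mean)]
    return [dot(vec, centered) for vec in basis]


def nearest_neighbour(sample, train, labels, mean, basis):
    """Label of the training sample closest to `sample` in feature space."""
    target = project(sample, mean, basis)
    distances = [math.dist(target, project(t, mean, basis)) for t in train]
    return labels[distances.index(min(distances))]


# Toy stand-in for normalized images: two well-separated classes.
TRAIN = [(0, 0, 1), (1, 1, 0), (0, 1, 1), (10, 0, 1), (11, 1, 0), (10, 1, 1)]
LABELS = ["A", "A", "A", "B", "B", "B"]
MEAN, BASIS = fit_kl_basis(TRAIN, n_features=2)

print(nearest_neighbour((9.5, 0.5, 0.5), TRAIN, LABELS, MEAN, BASIS))
```

Projecting onto a few leading components before the nearest-neighbor search reduces the cost of each distance computation, which matters when every test character is compared against thousands of stored training samples.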

Future  developments: 

• Reduction  of  the  training  set  required  to  be  stored. 

• Employment  of  geometric  learning. 
[Plot: GMD_1 digits, uppers, and lowers; error rate (%) against rejection rate (%)]

Figure 71: Error rate versus rejection rate for GMD_1

Figure 72: Error rate per writer of GMD_1
[Plot: correlation against system number]

Figure 73: GMD_1 - digit correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               GMD_1         1.0000              1.0000
2               GMD_4         0.9786              0.8954
3               GMD_3         0.9712              0.9027
4               VOTE_M        0.9227              0.9079
5               AEG           0.9152              0.9005
6               VOTE_P        0.9146              0.9040
7               NIST_4        0.9138              0.8932
8               REFERENCE     0.9127              0.9127
9               ELSAGB_3      0.9120              0.8993
10              ELSAGB_2      0.9118              0.8990
11              OCRSYS        0.9114              0.9060
12              ERIM_1        0.9106              0.8968
13              KODAK_2       0.9096              0.8953
14              ERIM_2        0.9093              0.8960
15              THINK_1       0.9093              0.8916
16              ATT_?         0.9091              0.8985
17              UBOL          0.9091              0.8935
18              ATT_4         0.9090              0.8949
19              ATT_?         0.9082              0.8967
20              IBM           0.9075              0.8967
21              SYMBUS        0.9058              0.8915
22              ATT_?         0.9056              0.8911
23              ELSAGB_1      0.9056              0.8894
24              KODAK_1       0.9052              0.8904
25              NESTOR        0.9044              0.8916
26              THINK_2       0.9018              0.8920
27              HUGHES_1      0.9010              0.8881
28              HUGHES_2      0.9004              0.8879
29              REI           0.8984              0.8906
30              NYNEX         0.8982              0.8883
31              GTESS_2       0.8931              0.8761
32              GTESS_1       0.8924              0.8764
33              NIST_1        0.8913              0.8709
34              ASOL          0.8821              0.8615
35              COMCOM        0.8818              0.8789
36              MIME          0.8800              0.8620
37              RISO          0.8800              0.8526
38              NIST_?        0.8786              0.8596
39              NIST_?        0.8759              0.8556
40              UPENN         0.8727              0.8558
41              KAMAN_1       0.8659              0.8425
42              KAMAN_3       0.8520              0.8280
43              KAMAN_2       0.8498              0.8257
44              KAMAN_5       0.8297              0.8074
45              GMD_2         0.8291              0.8072
46              VALEN_2       0.8069              0.7934
47              IFAX          0.8059              0.7863
48              VALEN_1       0.8002              0.7782
49              KAMAN_4       0.7832              0.7585

Table 47: GMD_1 correlation graph key for digits.
[Plot: correlation against system number]

Figure 74: GMD_1 - upper case correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               GMD_1         1.0000              1.0000
2               GMD_3         0.9679              0.8446
3               GMD_4         0.9595              0.8318
4               VOTE_M        0.8715              0.8549
5               AEG           0.8601              0.8467
6               REFERENCE     0.8596              0.8596
7               ATT_4         0.8587              0.8421
8               VOTE_P        0.8570              0.8461
9               UMICH_1       0.8551              0.8390
10              ERIM_1        0.8538              0.8375
11              UBOL          0.8510              0.8324
12              NESTOR        0.8508              0.8350
13              ATT_3         0.8493              0.8305
14              KODAK_?       0.8491              0.8304
15              ATT_?         0.8488              0.8349
16              ATT_?         0.8485              0.8295
17              NYNEX         0.8478              0.8359
18              IBM           0.8470              0.8308
19              NIST_4        0.8464              0.8135
20              SYMBUS        0.8460              0.8253
21              HUGHES_1      0.8442              0.8272
22              HUGHES_2      0.8440              0.8261
23              GTESS_1       0.8350              0.8186
24              GTESS_2       0.8345              0.8178
25              MIME          0.8341              0.8095
26              OCRSYS        0.8267              0.8163
27              RISO          0.8241              0.7899
28              NIST_1        0.8239              0.7892
29              ASOL          0.8235              0.8004
30              REI           0.8037              0.7870
31              KAMAN_1       0.7949              0.7708
32              NIST_3        0.7719              0.7537
33              KAMAN_3       0.7597              0.7339
34              KAMAN_2       0.7555              0.7271
35              COMCOM        0.7462              0.7402
36              IFAX          0.7392              0.7198
37              NIST_2        0.7313              0.7063
38              GMD_2         0.7245              0.6968
39              VALEN_1       0.7152              0.6886
40              KAMAN_4       0.7012              0.6710
41              KAMAN_5       0.6319              0.6073
42              UMICH_2       0.0528              0.0154

Table 48: GMD_1 correlation graph key for uppers.
[Plot: correlation against system number]

Figure 75: GMD_1 - lower case correlation

System Number   System Name   Correlation (all)   Correlation (correct)
1               GMD_1         1.0000              1.0000
2               GMD_4         1.0000              0.7746
3               GMD_3         0.9522              0.7613
4               VOTE_M        0.8117              0.7591
5               REFERENCE     0.7746              0.7746
6               ATT_4         0.7718              0.7232
7               ERIM_1        0.7693              0.7247
8               KODAK_1       0.7673              0.7199
9               AEG           0.7668              0.7235
10              UBOL          0.7644              0.7122
11              ATT_?         0.7623              0.7176
12              VOTE_P        0.7602              0.7325
13              ATT_?         0.7597              0.7193
14              NYNEX         0.7582              0.7182
15              ATT_3         0.7574              0.7100
16              UMICH_1       0.7571              0.7117
17              NIST_1        0.7542              0.6981
18              NIST_4        0.7541              0.6915
19              IBM           0.7489              0.7088
20              NESTOR        0.7485              0.7096
21              OCRSYS        0.7473              0.7110
22              HUGHES_1      0.7472              0.7057
23              HUGHES_2      0.7459              0.7040
24              GTESS_1       0.7362              0.6931
25              GTESS_2       0.7327              0.6888
26              RISO          0.7285              0.6765
27              ASOL          0.7146              0.6706
28              NIST_?        0.6963              0.6709
29              GMD_2         0.6692              0.6226
30              KAMAN_1       0.6470              0.5987
31              VALEN_1       0.6428              0.5945
32              NIST_?        0.6395              0.5954
33              KAMAN_3       0.6270              0.5795
34              KAMAN_2       0.6148              0.5684
35              KAMAN_5       0.5438              0.5020
36              KAMAN_4       0.5192              0.4757
37              COMCOM        0.4622              0.4613
38              UMICH_2       0.1098              0.0478

Table 49: GMD_1 correlation graph key for lowers.
SYSTEM: GMD_2


PARTICIPANT: Frank Smieja

ORGANIZATION: Gesellschaft fuer Mathematik und Datenverarbeitung
              (GMD), Sankt Augustin, Germany

PREPROCESSING: scaled, centered, contrast filtered images?

FEATURES: pixel representation only


CLASSIFICATION: network of worker, monitor, and decision NN in
                pandemonium system of MINOS modules, "pandemonium
                reflective system". Output is chosen from network
                with maximum confidence value by decision NN.


HARDWARE: SPARC2

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             4180      8979      9355      NSDB3
             ~420      ~420      ~420      writers

STATUS: on time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.1545    0.00    0.2457    0.00    0.2861
             0.10    0.1120    0.28    0.1321    0.28    0.1752
             0.12    0.1023    0.31    0.1219    0.31    0.1625
             0.13    0.0979    0.33    0.1172
             0.14    0.0941    0.34    0.1132
             0.16    0.0904    0.36    0.1080
             0.17    0.0855    0.38    0.1022

OCR RATE (CPS):   DIGITS    UPPERS    LOWERS
SYS RATE:         0.68      0.43      0.32
CPU RATE:
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SYSTEM:  GMD_2 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[17][18] 

COMMENTS:  GMD_2 


PARTICIPANT:  Frank Śmieja


ORGANIZATION:  Gesellschaft für Mathematik und Datenverarbeitung (GMD), Sankt Augustin,
Germany.


The data was learnt by a system of modular neural networks, described in the reports cited below.
The individual patterns to be learnt are automatically decomposed over the modular system, such
that the Worker neural network that learns to map a particular pattern to a target is the one that
is most specialized at that time to learn it. So that the appropriate network can be believed
during a test session, each Worker is paired with a partner Monitor network. The Monitor network
partnered with the Worker network that was allocated the pattern is trained to produce a positive
output when it sees the pattern. The other Monitor networks, associated with Workers that do not
learn the current pattern, are trained to produce a negative output on seeing this pattern.
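
The allocation step described above can be illustrated with a minimal sketch. It assumes that "most specialized" can be approximated by the lowest current error of a Worker on the pattern; that measure, the function name and the +1/-1 target values are illustrative assumptions, not details taken from the cited reports.

```python
from typing import List, Tuple


def allocate_pattern(worker_errors: List[float]) -> Tuple[int, List[float]]:
    """Pick the Worker for one pattern and build the Monitor training targets.

    worker_errors[i] is the current error of Worker i on the pattern.
    Assumption: the Worker with the lowest error counts as the most
    specialized one. Its Monitor gets a positive target, all others negative.
    """
    if not worker_errors:
        raise ValueError("at least one Worker is required")
    winner = min(range(len(worker_errors)), key=worker_errors.__getitem__)
    monitor_targets = [1.0 if i == winner else -1.0
                       for i in range(len(worker_errors))]
    return winner, monitor_targets


if __name__ == "__main__":
    # Three hypothetical Workers; the second one currently fits the pattern best.
    winner, targets = allocate_pattern([0.42, 0.07, 0.31])
    print(winner, targets)
```

In this sketch only the winning Worker would be trained on the pattern, while every Monitor is trained on it with its own target.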

Various confidence values are derived from the outputs of the Monitor networks during the test
sessions. An ambiguity measure is also derived from the degree of closeness of the two most positive
Monitor outputs. Both the confidences and the ambiguity are then used to filter out the answers
that are not provided with sufficient commitment (the "rejected" patterns).
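
A rejection filter of this kind might look like the following sketch. The particular confidence value (the largest Monitor output), the margin used as the ambiguity measure, and both threshold values are assumptions chosen for illustration; the text does not specify them.

```python
from typing import List, Optional


def decide(monitor_outputs: List[float],
           min_confidence: float = 0.5,
           min_margin: float = 0.2) -> Optional[int]:
    """Return the index of the Worker to believe, or None to reject.

    Confidence is taken as the largest Monitor output; ambiguity is judged
    by the gap between the two most positive outputs (a small gap means the
    answer is ambiguous). Both thresholds are hypothetical.
    """
    if len(monitor_outputs) < 2:
        raise ValueError("need at least two Monitor outputs")
    ranked = sorted(range(len(monitor_outputs)),
                    key=lambda i: monitor_outputs[i], reverse=True)
    best, second = ranked[0], ranked[1]
    confidence = monitor_outputs[best]
    margin = monitor_outputs[best] - monitor_outputs[second]
    if confidence < min_confidence or margin < min_margin:
        return None  # the pattern is rejected
    return best


if __name__ == "__main__":
    print(decide([0.9, -0.8, -0.7]))   # clear winner
    print(decide([0.9, 0.85, -0.7]))   # two Monitors nearly tie
    print(decide([0.3, -0.9, -0.9]))   # winner is not confident enough
```

Raising either threshold rejects more patterns, which is how the different rejection rates in a results table of this kind could be produced.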

Insofar  as  the  NIST  test  results  are  concerned,  it  was  observed  that  the  training  on  so  few  examples 
(see  above)  was  quite  a disadvantage.  The  system  was  able  to  model  the  NIST  training  set  well 
enough  to  produce  good  generalization  for  this  set,  but  in  general  10  worse  for  the  NIST  test  sets. 

BIBLIOGRAPHY: 

F. J. Śmieja. Multiple network systems (MINOS) modules: task division and module discrimination.
Proc. 8th AISB Conference on Artificial Intelligence, Leeds, April 1991.

F. J. Śmieja and H. Mühlenbein. Reflective modular neural networks. Submitted to Machine
Learning; available as GMD report number 633 (1992).
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NUMBER  WRITERS  WITN  ERROR  E 


Figure 76: Error rate versus rejection rate for GMD_2 (curves: digits, uppers, lowers)
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Figure  77:  Error  rate  per  writer  of  GMD-2 
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aMO_2.iMarr.coiwELATC 


Figure 78: GMD_2 - digit correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation (correct)

1 

GMDJ 

1 0000 

l.OOOO 

2 

VOTEJfcl 

0 8676 

0 8426 

3 

ATT.4 

0 8636 

0 8363 

4 

KODAKS 

0 8630 

0 8366 

S 

ATT^ 

0 8626 

0.8370 

e 

AEG 

0 8626 

0 8369 

7 

RISC 

0 8626 

0 8110 

g 

VOTEJ* 

0 8609 

0.8402 

9 

ERIM.l 

0 8604 

0 8347 

10 

ATT  J 

0 8604 

0.8325 

11 

NIST.4 

0 8604 

0 8306 

12 

THINKS 

0 8602 

0 8312 

13 

NISTJ 

0 8497 

0 8162 

14 

SYMBUS 

0 8496 

0 8320 

16 

KODAK-1 

0 8494 

0.8319 

16 

ERIM.2 

0.8483 

0.8341 

17 

ELSAGB.2 

0 8482 

0 8366 

18 

NISTJ 

0.8482 

0 8126 

19 

ELSAGB_2 

0 84  79 

0.8364 

20 

ELSAGB.l 

0 84  70 

0 8291 

21 

IBM 

0 8469 

0 8349 

22 

NESTOR 

0 8466 

0 8316 

23 

UBOL 

0 8468 

0 8307 

24 

OCRSYS 

0 8467 

0 8402 

26 

ATTa 

0,8467 

0 8360 

26 

GTESS-2 

0 8467 

0 8226 

27 

REFERENCE 

0 8466 

0-8466 

28 

GTESS-1 

0 8460 

0 8226 

29 

HUGHES-2 

0 8421 

0 8278 

30 

HUGHES-1 

0 8416 

0 8276 

31 

NIST-1 

0 8406 

0 8166 

32 

KAMAN-1 

0 8396 

0 8024 

33 

ASOL 

0 8392 

0.81 16 

34 

MIME 

0 8391 

0 8124 

36 

NYNEX 

0 8389 

0.8277 

36 

THINK-2 

0 8387 

0.8289 

37 

GMD.3 

0 8381 

0 8138 

38 

REI 

0 8362 

0 8280 

39 

GMD-1 

0 8291 

0 8072 

40 

UPENN 

0 8268 

0 8031 

41 

KAMAN-3 

0 8268 

0.7887 

42 

KAMAN.2 

0 8249 

0 7868 

43 

COMCOM 

0 8183 

0 8169 

44 

GMD-4 

0 8166 

0.7948 

46 

KAMAN-5 

0.7976 

0.7664 

46 

KAMAN-4 

0.7727 

0.7280 

47 

VALEN-1 

0-7669 

0.7377 

48 

IFAX 

0,7631 

0.7403 

49 

VALEN-2 

0.7694 

0 7438 

Table  50:  GMD_2  correlation  graph  key  for  digits. 
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(aiO_2.UPPEILCORRELATE 


Figure 79: GMD_2 - upper case correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation (correct)

GMD.2 

1 0000 

1 0000 

2 

RISC 

0 7710 

0 7165 

3 

ATT-4 

0 7671 

0.7452 

i 

VOTE31 

0 7649 

0.7493 

S 

KODAKU 

0 7S91 

0 7364 

6 

SYMBUS 

0 7S79 

0 7344 

7 

MIME 

0.7S61 

0.7236 

S 

ATT.2 

0 7S46 

0 7376 

9 

REFERENCE 

0.7S43 

0,7543 

10 

AEG 

0 7S42 

0 7424 

11 

VOTE 

0 7S32 

0 7442 

12 

ERIM-l 

0 7S29 

0 7377 

13 

IBM 

0 7S1 1 

0 7334 

14 

ATTJ 

0 7504 

0 7322 

IS 

NESTOR 

0 7494 

0 7343 

16 

UMICH.l 

0 7488 

0.7356 

17 

UBOL 

0 7488 

0.7311 

18 

GTESS.l 

0 7459 

0,7253 

19 

NYNEX 

0.7457 

0,7344 

20 

HUGHES. 1 

0 7457 

0.7283 

21 

HUGHES-2 

0 7454 

0.7277 

22 

ATTJ 

0 7450 

0 7291 

23 

GTESS J 

0.7450 

0.7239 

24 

ASOL 

0 7438 

0.7148 

2S 

NIST.4 

0 7363 

0 7123 

26 

NIST.l 

0.7304 

0 6986 

27 

OCRSYS 

0.7287 

0 7187 

28 

NIST  J 

0 7266 

0 6895 

29 

GMD.l 

0 7245 

0 6968 

30 

GMD.3 

0 7226 

0 6947 

31 

KAMAN.l 

0 7225 

0 6917 

32 

REI 

0 7154 

0 6975 

33 

NIST.2 

0 7063 

0 6569 

34 

GMD.4 

0 7051 

0,6796 

3S 

KAMAN.3 

0 6963 

0 6633 

36 

KAMAN.2 

0 6954 

0 6586 

37 

IFAX 

0 6686 

0 6429 

38 

COMCOM 

0 6618 

0 6569 

39 

KAMAN.4 

0 6543 

0 6134 

40 

valen.i 

0 6540 

0 6205 

41 

KAMAN.S 

0 5801 

0.5490 

42 

UMICH-2 

0 0805 

0 0143 

Table  51:  GMD_2  correlation  graph  key  for  uppers. 
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QMO_2J.OWERCOIWELATE 


Figure 80: GMD_2 - lower case correlation


System  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

GMD-2 

1 0000 

1 0000 

2 

VOTE-M 

0.7470 

0 7013 

3 

ATT.1 

0 7287 

0.6779 

4 

RISC 

0.7263 

0 6307 

5 

ATTJ2 

0 7219 

0.6758 

6 

REFERENCE 

0 7139 

0 7139 

7 

ERIM.l 

0 7133 

0 6723 

8 

ATT  J 

0 7100 

0 6637 

9 

IBM 

0 7097 

0 6646 

10 

KODAK J 

0.7081 

0 6662 

11 

VOTEJ* 

0 7068 

0 6801 

12 

NYNEX 

0 7033 

0 6671 

13 

NIST.l 

0 7031 

0 6313 

14 

UMICH.l 

0 7022 

0 6610 

13 

OCRSYS 

0 7021 

0 6642 

16 

GTESS.l 

0 7016 

0.6337 

1 7 

ATT  J 

0 7010 

0.6637 

18 

AEG 

0 7000 

0 6639 

19 

UBOL 

0 6998 

0 6368 

20 

GTESS  J 

0 6992 

0 6307 

21 

NESTOR 

0 6977 

0 6398 

22 

HUGHES. 1 

0 6917 

0 6336 

23 

HUGHES. 2 

0 6889 

0 631 1 

24 

ASOL 

0 6832 

0.6344 

23 

NIST  J 

0 6830 

0 6412 

26 

GMD.3 

0 6822 

0 6339 

27 

NIST.l 

0 6798 

0.6330 

28 

GMD.4 

0 6692 

0 6226 

29 

GMD.l 

0 6692 

0.6226 

30 

NIST.2 

0 6313 

0 3823 

31 

KAMAN.l 

0 6262 

0.3713 

32 

K AMAN.3 

0 6093 

0 3348 

33 

VALEN.1 

0 6081 

0 3383 

34 

KAMAN.2 

0 3921 

0 3398 

33 

K AMAN_3 

0 3239 

0 4779 

36 

KAMAN-4 

0 3033 

0 4327 

37 

COMCOM 

0 4342 

0 4247 

38 

UMICH.2 

0.1047 

0.0412 

Table  52:  GMD_2  correlation  graph  key  for  lowers. 
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SYSTEM;  GMD.3 


PARTICIPANT:  Frank  Smieja 

ORGANIZATION:  Gesellschaft  fuer  Mathematik  und  Datenverarbeitung 
(GMD) , Sankt  Augustin,  Germany 


FEATURES:  genetically  optimized  polynomial  filter 
CLASSIFICATION:  statistical 
HARDWARE:  SPARC2 

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE 

15000  22000  22000  NSDB3 

STATUS:  on  time 

RESULTS:  -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
          REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
          RATE    RATE     RATE    RATE     RATE    RATE
          0.00    0.0813   0.00    0.1422   0.00    0.2085
          0.07    0.0479   0.15    0.0700   0.19    0.1187

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        1.18    0.48    0.40
CPU RATE:
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SYSTEM:  GMDJ} 

BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[15][16] 
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ERROR  RATE  (%) 


Figure 81: Error rate versus rejection rate for GMD_3 (x-axis: rejection rate (%); curves: digits, uppers, lowers)


Figure 82: Error rate per writer of GMD_3
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GMO_a.O(QIT.COnRELATE 


Figure 83: GMD_3 - digit correlation (x-axis: system number)


System  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

GMD-3 

1.0000 

1 0000 

2 

GMD.l 

0.9712 

0 9027 

3 

GMD.4 

0 9498 

0 8864 

4 

VOTE^ 

0.9308 

0.9146 

5 

AEG 

0.9239 

0 9076 

6 

NIST.4 

0 9226 

0 9001 

7 

VOTEJ= 

0 9224 

0.9109 

S 

ELS  AGB.J 

0 9198 

0 9069 

9 

ELSAGB J 

0 9194 

0 9066 

10 

REFERENCE 

0 9187 

0 9187 

11 

ERIM.l 

0 9186 

0 9036 

12 

OCRSYS 

0 9181 

0 9122 

13 

KODAK-2 

0 9172 

0 9019 

14 

ERIM-2 

0 9170 

0 9026 

IS 

ATT.4 

0 9170 

0 9016 

16 

THINK-1 

0.9167 

0.8979 

17 

UBOL 

0,9166 

0 8999 

16 

ATT.2 

0 9161 

0 9033 

19 

ATTJ 

0.9166 

0.9044 

20 

IBM 

0 9141 

0 9029 

21 

SYMBUS 

0 9136 

0.8981 

22 

ATT.3 

0.9133 

0.8977 

23 

ELSAGB.l 

0 9133 

0.8968 

24 

KODAK J 

0 9130 

0 8973 

26 

NESTOR 

0 9122 

0 8982 

26 

HUGHES. 1 

0.9082 

0 8947 

27 

THINK.2 

0 9081 

0 8980 

28 

HUGHES.2 

0.9076 

0 8944 

29 

NYNEX 

0 904  7 

0.8944 

30 

REI 

0 9046 

0 8966 

31 

GTESS.2 

0 9002 

0 8821 

32 

GTESS-1 

0 8996 

0 8826 

33 

NIST.l 

0 8996 

0 8776 

34 

RISO 

0 8899 

0.8699 

36 

ASOL 

0 8893 

0 8677 

36 

COMCOM 

0 8876 

0 8847 

37 

MIME 

0 8874 

0.8683 

38 

NIST.2 

0 8870 

0 8661 

39 

NIST-3 

0 8848 

0 8624 

40 

UPENN 

0 8798 

0 8620 

41 

K AMAN.l 

0 8746 

0 8493 

42 

K AMANJJ 

0 8610 

0 8361 

43 

KAMAN.2 

0 8686 

0.8324 

44 

GMD.2 

0 8381 

0.8138 

46 

KAMAN.S 

0 8374 

0.8138 

46 

VALEN.2 

0 8128 

0.7990 

47 

IFAX 

0 8113 

0 7916 

48 

VALEN.l 

0 8069 

0 7839 

49 

KAMAN.4 

0.7916 

0 7646 

Table  53:  GMD_3  correlation  graph  key  for  digits. 
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aMO.XUPPERCORflELATE 


Figure 84: GMD_3 - upper case correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation (correct)

GMD.J 

1.0000 

1 0000 

2 

GMD.l 

0.9679 

0 8446 

3 

GMD.4 

0.9632 

0.8280 

4 

VOTE-M 

0.8689 

0 8626 

b 

REFERENCE 

0 8678 

0 8678 

6 

AEG 

0 8673 

0 8444 

7 

ATT.4 

0 8670 

0.8401 

S 

VOTEJ* 

0.8647 

0 8437 

9 

UMICH.1 

0.8639 

0.8372 

10 

ERIM.1 

0.8619 

0.8368 

11 

NESTOR 

0.8497 

0.8333 

12 

UBOL 

0 8496 

0.8308 

13 

ATT  J 

0 8482 

0 8333 

14 

KODAKU 

0 8478 

0.8287 

16 

ATTa 

0.8474 

0.8277 

16 

NYNEX 

0 8467 

0 8343 

17 

ATT  J 

0 8467 

0 8278 

18 

IBM 

0 8463 

0 8290 

19 

SYMBUS 

0 8463 

0.8240 

20 

NIST.4 

0 8442 

0 8113 

21 

HUGHES. 1 

0 8416 

0 8260 

22 

HUGHES.2 

0 8414 

0 8238 

23 

GTESS.1 

0 8341 

0 8169 

24 

GTESS.2 

0 8331 

0 8168 

25 

MIME 

0.8323 

0 8076 

26 

OCRSYS 

0 8269 

0 8160 

27 

RISO 

0 8220 

0,7876 

28 

NIST.l 

0 8216 

0.7870 

29 

ASOL 

0 8208 

0 7981 

30 

REI 

0 8016 

0 7861 

31 

KAMAN.l 

0 7927 

0 7686 

32 

NISTJ* 

0 7706 

0.7626 

33 

KAMAN-3 

0.7664 

0.7313 

34 

KAMAN^ 

0.7636 

0 7262 

36 

COMCOM 

0 7462 

0 7391 

36 

IFAX 

0 7401 

0 7196 

37 

NIST-2 

0 7302 

0 7060 

38 

GMD.2 

0,7226 

0.6947 

39 

VALEN-1 

0.7117 

0 6866 

40 

KAMAN.4 

0.6998 

0.6696 

41 

KAMAN.i 

0 6286 

0.6049 

42 

UMICH-2 

0 0644 

0.0149 

Table  54:  GMD_3  correlation  graph  key  for  uppers. 
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OMO.aXOWERCOMIELATE 


Figure 85: GMD_3 - lower case correlation (x-axis: system number)


System Number

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

GMD  J 

1 0000 

l.OOOO 

2 

GMD.4 

0 9522 

0 7613 

3 

GMD.l 

0.9522 

0 7613 

4 

VOTE^ 

0 8292 

0 7759 

5 

REFERENCE 

0 7915 

0 7915 

6 

ATT.4 

0 7878 

0 7390 

7 

ERIM.l 

0 7855 

0 7403 

8 

KODAKJ 

0 7843 

0 7362 

9 

AEG 

0 7826 

0 7391 

10 

UBOL 

0 7809 

0 7283 

11 

VOTE  J" 

0-7763 

0.7477 

12 

ATTJ2 

0.7762 

0 7352 

13 

ATTa 

0-7753 

0.7320 

14 

ATTJ 

0.7738 

0 7260 

15 

NYNEX 

0.7728 

0 7327 

16 

UMICH.1 

0 7713 

0 7265 

17 

NIST.4 

0 7698 

0 7067 

18 

NIST.l 

0.7692 

0.7132 

19 

NESTOR 

0 7642 

0 7247 

20 

IBM 

0 7636 

0 7236 

21 

HUGHES-1 

0 7629 

0 7212 

22 

OCRSYS 

0 7613 

0 7256 

23 

HUGHES.2 

0.7607 

0,7190 

24 

GTESS.l 

0 7Mi 

0.7084 

25 

GTESS.2 

0 7474 

0 7032 

26 

RISO 

0 7449 

0.6917 

27 

ASOL 

0 7283 

0 6841 

28 

NIST-3 

0.7129 

0 6849 

29 

GMD.2 

0 6822 

0 6359 

30 

KAMAN-1 

0 6584 

0 6108 

31 

VALEN.l 

0 6542 

0 6062 

32 

NIST.2 

0 6538 

0 6083 

33 

KAMAN.3 

0 6373 

0 5907 

34 

KAMAN-2 

0 6261 

0.5800 

35 

KAMAN.4 

0 5522 

0 5116 

36 

KAMAN.4 

0.5268 

0 4839 

37 

COMCOM 

0 4712 

0 4594 

38 

UMICH.2 

0.1073 

0.0498 

Table  55:  GMD_3  correlation  graph  key  for  lowers. 
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SYSTEM:  GMD.4 


PARTICIPANT:  Frank  Smieja 

ORGANIZATION:  Gesellschaft  fuer  Mathematik  und  Datenverarbeitung 
(GMD), Sankt Augustin, Germany

FEATURES:  genetically optimized polynomial filter

CLASSIFICATION:  statistical

HARDWARE:  SPARC2

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           15000   22000   22000   NSDB3

STATUS:  on time

RESULTS:  -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
          REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
          RATE    RATE     RATE    RATE     RATE    RATE
          0.00    0.1016   0.00    0.1585   0.00    0.2254

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        1.18    0.48    0.40
CPU RATE:
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SYSTEM:  GMD.4 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[15][16]
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NUMBER  WRITERS  WITH  ERROR  » E 


Figure 86: Error rate versus rejection rate for GMD_4 (x-axis: rejection rate (%); curves: digits, uppers, lowers)


Figure 87: Error rate per writer of GMD_4
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aM0_4.0iaT.CORnELATE 


Figure 88: GMD_4 - digit correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation (correct)

1 

gMD.4 

1.0000 

1.0000 

2 

GMD.l 

0 9786 

0 8964 

3 

GMD  J 

0 9498 

0.8864 

4 

VOTE^ 

0 9069 

0.8930 

5 

AEG 

0 8998 

0 8868 

6 

VOTEJ= 

0 8988 

0 8890 

7 

REFERENCE 

0 8984 

0 8984 

8 

NIST.4 

0 8977 

0.8783 

9 

ELSAGBJ 

0.8972 

0.8849 

10 

OCRSYS 

0.8971 

0.8918 

11 

ELSAGB^ 

0 8970 

0 8846 

12 

ERIM.l 

0.8962 

0 8821 

13 

ATTJ 

0 8946 

0 8843 

14 

KODAKJJ 

0 8946 

0.8807 

13 

UBOL 

0 8944 

0 8792 

16 

THINK-1 

0 8944 

0 8771 

1 7 

ERIM.2 

0 8942 

0.8816 

18 

ATT.4 

0 8935 

0.8801 

19 

ATT.2 

0 8933 

0.8822 

20 

IBM 

0 8928 

0 8823 

21 

ELSAGB.l 

0 8910 

0.8751 

22 

ATTJ 

0 8907 

0 8768 

23 

SYMBUS 

0 8904 

0 8768 

24 

KODAK.! 

0 8904 

0.8760 

25 

NESTOR 

0 8891 

0.8769 

26 

THINK.2 

0 8873 

0.8778 

27 

HUGHES-1 

0 8866 

0 8740 

28 

HUGHES_2 

0 8860 

0 8738 

29 

REI 

0 8844 

0 8766 

30 

NYNEX 

0.8840 

0 8743 

31 

GTESS.2 

0 8788 

0 8620 

32 

GTESS.l 

0 8783 

0 8625 

33 

NIST.l 

0 8763 

0 8564 

34 

COMCOM 

0 8686 

0 8657 

35 

ASOL 

0 8683 

0 8479 

36 

MIME 

0 8666 

0 8480 

37 

RISO 

0 8648 

0 8382 

38 

NIST.2 

0 8643 

0 8466 

39 

NIST.3 

0 8617 

0.8417 

40 

UPENN 

0 8694 

0 8425 

41 

KAMAN.l 

0 8620 

0 8289 

42 

KAMANJJ 

0 8384 

0.8147 

43 

KAMAN.2 

0 8363 

0 8123 

44 

KAMAN.4 

0 8170 

0 7946 

45 

GMD.2 

0 8166 

0 7948 

46 

IFAX 

0 7946 

0 7744 

47 

VALEN.2 

0 7944 

0 7807 

48 

VALEN.1 

0 7888 

0.7662 

49 

KAMAN.4 

0.7713 

0.7463 

Table  56:  GMD_4  correlation  graph  key  for  digits. 
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aMO_4.UPPEILCOIWELATE 


Figure 89: GMD_4 - upper case correlation (x-axis: system number)


System Number

System Name

Correlation  ( all ) 

Correlation  (correct) 

1 

GMD.4 

1 0000 

1 0000 

2 

GMD.l 

0 9696 

0 8318 

i 

GMD  J 

0 9632 

0 8280 

4 

VOTEJVl 

0.8613 

0.8369 

5 

REFERENCE 

0 8416 

0 8416 

6 

AEG 

0 8409 

0 8282 

7 

ATT.4 

0 8386 

0 8230 

8 

VOTEJ* 

0 8369 

0 8266 

9 

UMICH.l 

0 8344 

0 8197 

10 

ERIM-1 

0 8341 

0 8192 

1 1 

UBOL 

0 8323 

0 8144 

12 

NESTOR 

0 8318 

0 8166 

U 

ATTJ 

0 8304 

0 8116 

L4 

ATTJ 

0 8303 

0.8164 

ib 

ATTJ 

0 8294 

0 8119 

16 

KODAKJ 

0 8292 

0.8118 

17 

NYNEX 

0 8288 

0 8174 

IS 

NIST.4 

0 8282 

0.7967 

19 

IBM 

0 8277 

0 8124 

20 

SYMBUS 

0 8270 

0 8070 

21 

HUGHES. 1 

0 8246 

0 8089 

22 

HUGHES-2 

0 8241 

0 8076 

23 

GTESS.l 

0 8166 

0.8004 

24 

GTESS.2 

0 8161 

0.7996 

2& 

MIME 

0 8140 

0 7913 

26 

OCRSYS 

0.8090 

0.7988 

27 

ASOL 

0 8061 

0 7836 

28 

NIST.l 

0 8036 

0.7711 

29 

RISO 

0 8009 

0 7697 

30 

REI 

0.7834 

0,7688 

31 

KAMAN.l 

0 7763 

0 7632 

32 

NIST  J 

0.7642 

0 7370 

33 

KAMAN J 

0 7406 

0 7166 

34 

K AMAN.2 

0.7360 

0 7097 

36 

COMCOM 

0,7324 

0 7261 

36 

IFAX 

0.7240 

0 7042 

37 

NISTJ 

0 7142 

0.6904 

38 

GMD-2 

0 7061 

0 6796 

39 

VALEN.l 

0 6966 

0.6716 

40 

KAMAN.4 

0 6836 

0.6660 

41 

KAMAN.6 

0 6171 

0 6928 

42 

UMICHJ 

0 0606 

0 0169 

Table  57:  GMD_4  correlation  graph  key  for  uppers. 
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QMO_4a.OWERCOWIELATE 


Figure 90: GMD_4 - lower case correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation  (correct) 

1 

(iMD.4 

1 0000 

1 0000 

2 

GMD-l 

1 0000 

0 7746 

3 

GMD  J 

0 9622 

0 7613 

4 

VOTEJVl 

0 *117 

0,7591 

5 

REFERENCE 

0 7746 

0 7746 

6 

ATT.4 

0 7718 

0 7232 

7 

ERIM.l 

0-7693 

0.7247 

S 

KODAKJ 

0 7673 

0.7199 

9 

AEG 

0.766S 

0.7235 

10 

UBOL 

0.7644 

0 7122 

ll 

ATTJ 

0.7623 

0.7176 

12 

VOTEJ> 

0 7602 

0.7326 

13 

ATTJ2 

0.7597 

0 7193 

14 

NYNEX 

0.75S2 

0-7182 

lb 

ATTJ 

0 7574 

0 7100 

16 

UMICH.l 

0 7571 

0 7117 

17 

NIST.l 

0.7542 

0 6981 

IS 

NIST.4 

0 7541 

0 6915 

19 

IBM 

0 7489 

0.7088 

20 

NESTOR 

0 7485 

0-7095 

21 

OCRSYS 

0 7473 

0 7110 

22 

HUGHES-1 

0 7472 

0-7057 

23 

HUGHES-2 

0 7489 

0 7040 

24 

GTESS-1 

0 7352 

0-6931 

25 

GTESS-2 

0 7327 

0 6888 

26 

RISO 

0 7285 

0 6765 

27 

ASOL 

0.7146 

0.6706 

2S 

NISTJ 

0 6983 

0 6709 

29 

GMD.2 

0 6692 

0 6226 

30 

KAMAN.l 

0 64  70 

0-5987 

31 

VALEN.1 

0 6428 

0 5945 

32 

NISTJ 

0.6395 

0.5954 

33 

KAMANJ 

0 6270 

0 5795 

34 

KAMAN.2 

0 6148 

0 5684 

35 

KAMANJ 

0 5438 

0 5020 

36 

KAMAN.4 

0 5192 

0 4757 

37 

COMCOM 

0 4622 

0 4513 

3S 

UMICH-2 

0.1098 

0 0478 

Table  58:  GMD_4  correlation  graph  key  for  lowers. 
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SYSTEM:  GTESS.l 


PARTICIPANT:  Dr.  Vadim  Anshelevich 
ORGANIZATION:  GTESS  CORPORATION,  Richardson,  TX 

PREPROCESSING:  size  normalization,  deskewing,  and  dimension  reduction. 

FEATURES:  vectors  from  non-linear  transformations,  result  is  200-400 
dimensional  vector. 


CLASSIFICATION:  MLP, training performed with variant of perceptron
                 training algorithm modified for feed-forward network.

HARDWARE:  50 MHz 486

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           1/4     3/4     3/4     NSDB3
           4983    8217    7103    INTERNAL
           ~70     ~70     ~70     writers

STATUS:  on time, corrected CON files 9 days late

RESULTS:  -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
          REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
          RATE    RATE     RATE    RATE     RATE    RATE
          0.00    0.0659   0.00    0.0801   0.00    0.1753
          0.10    0.0284   0.10    0.0374   0.10    0.1296
          0.20    0.0202   0.20    0.0186   0.20    0.0918
          0.30    0.0202   0.30    0.0169   0.30    0.0613
          0.40    0.0205   0.40    0.0167   0.40    0.0575
          0.50    0.0210   0.50    0.0172   0.50    0.0580

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        15.51   3.43    3.51
CPU RATE:
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SYSTEM:  GTESS.l 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
none 

COMMENTS:  GTESS 
COMPANY  INFORMATION 

GTESS Corporation was founded in January, 1991. It currently employs 6 people and is in the
business of providing inexpensive PC-based hand print and machine print form recognition systems.
Areas of interest are: character recognition, character segmentation, form pre-processing, form
post-processing (context), case independence (lower/upper), style independence (machine/hand).

RECOGNITION  TECHNOLOGY  DESCRIPTION  SUMMARY 

We use a two-stage isolated character recognition engine composed of 1) reduction and normalization
and 2) neural classification.

Instead  of  back  propagation,  we  use  a modified  perceptron  training  algorithm  which  allows  us  to 
retrain  our  network  in  a matter  of  hours  rather  than  weeks.  Training  and  production  algorithms 
do  not  require  floating  point,  are  portable,  run  on  PC  platforms  without  special  hardware  and 
recognize  at  the  rate  of  10-100  characters/sec  on  a 50  MHz  486,  depending  upon  the  alphabet. 
Inexpensive  DSP  implementations  are  also  being  developed  for  high  performance  systems. 
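
As an illustration of training without floating point, the sketch below implements the classic multiclass perceptron rule with integer weights and integer features. It is a textbook stand-in and not the modified algorithm referred to above, whose details are not given here; the function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def predict(weights: List[List[int]], x: Sequence[int]) -> int:
    """Return the class whose integer weight vector scores highest on x."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_perceptron(samples: Sequence[Tuple[Sequence[int], int]],
                     n_classes: int,
                     n_features: int,
                     epochs: int = 10) -> List[List[int]]:
    """Classic multiclass perceptron using only integer arithmetic.

    On each mistake the feature vector is added to the weights of the true
    class and subtracted from the weights of the wrongly predicted class.
    """
    weights = [[0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        mistakes = 0
        for x, label in samples:
            guess = predict(weights, x)
            if guess != label:
                mistakes += 1
                for j, x_j in enumerate(x):
                    weights[label][j] += x_j
                    weights[guess][j] -= x_j
        if mistakes == 0:  # every training sample is classified correctly
            break
    return weights


if __name__ == "__main__":
    # Toy, linearly separable data; the last feature acts as a bias input.
    data = [((2, 0, 1), 0), ((0, 2, 1), 1)]
    w = train_perceptron(data, n_classes=2, n_features=3)
    print([predict(w, x) for x, _ in data])
```

Because every update adds or subtracts integer features, the weights stay integral throughout, so both training and recognition can run on hardware without a floating-point unit.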

INTERPRETATION  OF  NIST  CONFERENCE  RESULTS 

We feel that our current algorithms offer an attractive compromise between reliability of recognition
and economy of implementation. The Conference results indicate to us that we are able to achieve
one of the best overall reliability recognition rates among the participants that relied only on
NIST-supplied training data.

PRODUCTS 

Within  the  next  few  months,  GTESS  will  start  distributing  two  products: 

1)  A PC-based,  all  software  form  recognition  subsystem; 

2)  A field  recognition  engine  under  Windows  3.x  to  be  used  in  form  processing  applications. 
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ERROR  RATE  (%) 


Figure 91: Error rate versus rejection rate for GTESS_1 (x-axis: rejection rate (%); curves: digits, uppers, lowers)


Figure 92: Error rate per writer of GTESS_1


Figure 93: GTESS_1 - digit correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation (correct)

1 

gteSsu 

1 0000 

l.OOOO 

2 

GTESS J 

0 9806 

0.9260 

Z 

VOTEJVl 

0.9436 

0 9287 

4 

ATT.4 

0 9366 

0.9181 

6 

AEG 

0 9362 

0 9203 

6 

VOTEJ* 

0 9347 

0 9244 

7 

REFERENCE 

0 9341 

0 9341 

8 

OCRSYS 

0 9336 

0 9277 

9 

ERIM.l 

0 9334 

0 9182 

10 

ELSAGB J 

0 9330 

0 9203 

11 

ELSAGB J 

0 9328 

0 9200 

12 

ATTJ 

0 9327 

0.9209 

13 

ATTJ 

0 9326 

0 9186 

14 

KODAK_2 

0 9322 

0 9170 

IS 

ERIMJ 

0 9283 

0 9166 

16 

KODAKJ 

0 9279 

0 9126 

17 

NIST.4 

0 9277 

0.9101 

IS 

UBOL 

0 9276 

0 9129 

19 

THINKJ 

0 9273 

0 9107 

20 

IBM 

0 9272 

0,9170 

21 

ELSAGB.l 

0 9270 

0 9097 

22 

ATTJ 

0 9269 

0.9111 

23 

SYMBUS 

0 9243 

0,9106 

24 

NESTOR 

0 9222 

0 9107 

2S 

THINK  J 

0 9213 

0 9124 

26 

HUGHES. 1 

0 9207 

0 9081 

27 

HUGHES.2 

0 9207 

0 9080 

2S 

REI 

0 9188 

0 9109 

29 

NYNEX 

0 9179 

0.9093 

30 

NIST.l 

0 9077 

0.8882 

31 

NISTJ 

0 9068 

0 8816 

32 

NISTJ 

0 9032 

0.8773 

33 

COMCOM 

0 9013 

0 8986 

34 

ASOL 

0 9006 

0.8799 

3S 

MIME 

0.9004 

0.8814 

36 

GMDJ) 

0 8996 

0 8826 

37 

UPENN 

0 8946 

0.8764 

38 

RISC 

0 8929 

0 8677 

39 

GMD.l 

0 8924 

0.8764 

40 

KAMAN.l 

0 8821 

0 8693 

41 

GMD-4 

0 8783 

0 8626 

42 

KAMAN_3 

0 8663 

0.8432 

43 

KAMAN J 

0.8626 

0 8406 

44 

GMD.2 

0 8460 

0.8226 

4S 

KAMAN.S 

0 8382 

0 8211 

46 

VALENJ 

0 8212 

0.8102 

47 

IFAX 

0 8192 

0.8016 

48 

VALEN.1 

0 8068 

0.7899 

49 

KAMAN.4 

0 7946 

0.7713 

Table 59: GTESS_1 correlation graph key for digits.
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GTESS_1.UPPeR.C0Ri)ELATE 


Figure 94: GTESS_1 - upper case correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation (correct)

1 

GTESSU 

1 0000 

1-0000 

2 

GTESS J 

0 9864 

0 9139 

3 

VOTE-M 

0 9286 

0 9119 

4 

REFERENCE 

0.9199 

0 9199 

& 

AEG 

0 9189 

0 9042 

6 

ATT.4 

0 9168 

0.8981 

7 

ERIM.l 

0.9101 

0 8930 

% 

ATT  J 

0.9101 

0.8921 

9 

NYNEX 

0.9079 

0 8942 

10 

KODAK J 

0 9064 

0.8864 

ll 

VOTEJ* 

0 9063 

0.8939 

12 

UBOL 

0.9044 

0 8863 

13 

ATT-1 

0 9034 

0.8833 

14 

ATTJ 

0.9003 

0.8824 

13 

UMICH.l 

0 8997 

0 8883 

16 

NESTOR 

0 8985 

0 8833 

17 

SYMBUS 

0 8977 

0 8794 

18 

HUGHES-l 

0 8936 

0 8812 

19 

HUGHES_2 

0 8940 

0 8791 

20 

IBM 

0 8926 

0 8802 

21 

OCRSYS 

0 8838 

0 8737 

22 

MIME 

0.8764 

0 8338 

23 

NIST.4 

0 8683 

0 8301 

24 

ASOL 

0 8662 

0.8462 

23 

RISO 

0.8449 

0 8219 

26 

REI 

0 8442 

0 8327 

27 

NIST.l 

0.8429 

0 8217 

26 

GMD.l 

0 8330 

0 8186 

29 

GMD.3 

0.8341 

0.8169 

30 

KAMAN.l 

0 8240 

0 8076 

31 

GMD.4 

0 8166 

0 8004 

32 

NIST  J 

0 8123 

0.7944 

33 

COMCOM 

0 7931 

0.7873 

34 

IFAX 

0.7822 

0,7649 

33 

KAMAN.3 

0 7807 

0 7643 

36 

KAMAN.2 

0 7724 

0.7334 

37 

NIST-2 

0.7643 

0,7414 

38 

GMD-2 

0 7439 

0.7233 

39 

VALEN.l 

0 7337 

0-7196 

40 

KAMAN.4 

0 71 16 

0 6939 

41 

KAMAN.3 

0 6443 

0 6303 

42 

UMICH_2 

0 0497 

0,0231 

Table 60: GTESS_1 correlation graph key for uppers.
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arrE8S_1.U>WERCOmELATE 


Figure 95: GTESS_1 - lower case correlation (x-axis: system number)


System Number

System Name

Correlation (all)

Correlation  (correct) 

1 

GTESS.l 

1 0000 

l.OOOO 

2 

GTESS  J 

0.8892 

0 7779 

3 

VOTEJ4 

0 8547 

0 8048 

i 

ATT  J 

0 8267 

0 7734 

6 

REFERENCE 

0 8248 

0 8248 

6 

ATT.4 

0 8176 

0 7678 

7 

ERIM.l 

0 8167 

0.7691 

8 

AEG 

0 8148 

0 7702 

9 

ATTJ 

0.8114 

0 7647 

10 

OCRSYS 

0 8083 

0.7641 

1 1 

KOD AKU 

0 8058 

0 7612 

12 

IBM 

0 8023 

0.7556 

13 

NYNEX 

0 8015 

0 7619 

14 

ATTJ 

0.7972 

0 7496 

15 

UBOL 

0 7965 

0 7504 

16 

UMICH.l 

0 7943 

0.7517 

17 

HUGHES. 1 

0 7937 

0 7493 

18 

VOTEJ> 

0.7931 

0 7662 

19 

HUGHES.2 

0.7917 

0 7473 

20 

NESTOR 

0.7898 

0 7498 

21 

NIST.l 

0.7763 

0 7307 

22 

RISO 

0 7690 

0 7131 

23 

NIST.4 

0.7628 

0 7153 

24 

ASOL 

0.7561 

0-7084 

25 

GMD.3 

0 7515 

0 7084 

26 

NISTJJ 

0.7429 

0 7107 

27 

GMD.4 

0 7352 

0 6931 

28 

GMD.l 

0 7352 

0 6931 

29 

GMD.2 

0 7016 

0 6537 

30 

NISTJ2 

0.6843 

0 6331 

31 

KAMAN.l 

0 6619 

0 6208 

32 

VALEN.l 

0 6587 

0 6168 

33 

KAMAN.3 

0 6413 

0 6007 

34 

K AMAN.2 

0 6245 

0 5880 

35 

K AMAN.i 

0 5527 

0 5203 

36 

KAMAN.4 

0 5203 

0 4888 

37 

COMCOM 

0 4873 

0 4757 

38 

UMICH  J 

0, 1026 

0 0518 

Table 61: GTESS_1 correlation graph key for lowers.
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SYSTEM:  GTESS_2 


PARTICIPANT:  Dr.  Vadim  Anshelevich 


ORGANIZATION:  GTESS CORPORATION, Richardson, TX

FEATURES:  vectors from non-linear transformations

CLASSIFICATION:  MLP

HARDWARE:  50 MHz 486

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           1/4     3/4     3/4     NSDB3
           4983    8217    7103    INTERNAL
           ~70     ~70     ~70     writers

STATUS:  on time, corrected CON files 9 days late

RESULTS:  -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
          REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
          RATE    RATE     RATE    RATE     RATE    RATE
          0.00    0.0675   0.00    0.0814   0.00    0.1842
          0.10    0.0301   0.10    0.0381   0.10    0.1358
          0.20    0.0188   0.20    0.0198   0.20    0.0992
          0.30    0.0189   0.30    0.0176   0.30    0.0684
          0.40    0.0194   0.40    0.0173   0.40    0.0515
          0.50    0.0203   0.50    0.0176   0.50    0.0522

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        18.80   3.37    3.39
CPU RATE:
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SYSTEM:  GTESS^ 

BIBLIOGRAPHY:

The  following  references  have  been  provided  for  this  system: 
none 
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NUMBER  WRITERS  WITH  ERROR 


Figure 96: Error rate versus rejection rate for GTESS_2 (x-axis: rejection rate (%); curves: digits, uppers, lowers)


Figure 97: Error rate per writer of GTESS_2
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arCSS.ZJMaiT.CORRELATE 


Figure 98: GTESS_2 - digit correlation (x-axis: system number)


System Number

System Name

Correlation  ( all ) 

Correlation  (correct) 

GTESS  J 

1 0000 

1 0000 

2 

GTESS.1 

0 9806 

0 9230 

3 

VOTE^ 

0.9429 

0 9277 

4 

ATT.4 

0.9336 

0 9173 

b 

AEG 

0.9346 

0 9193 

6 

VOTE_P 

0 9341 

0 9234 

7 

ERIM.l 

0.9333 

0 9173 

8 

REFERENCE 

0 9323 

0.9323 

9 

ATT  J 

0.9322 

0.9200 

10 

ELSAGB J 

0 9322 

0 9189 

11 

OCRSYS 

0 9320 

0 9261 

12 

ELSAGB J 

0 9319 

0 9186 

13 

ATT  J 

0 9318 

0.9176 

14 

KODAK-2 

0 9318 

0.9162 

13 

ER1M.2 

0 9280 

0 9147 

16 

NIST.4 

0 9280 

0 9096 

17 

KODAK-1 

0 92T3 

0 9119 

18 

THINK.l 

0 9268 

0 9098 

19 

UBOL 

0 9263 

0 9116 

20 

ELS  AGB.l 

0 9263 

0 9087 

21 

IBM 

0 9262 

0,9137 

22 

ATTJ 

0 9232 

0 9101 

23 

SYMBUS 

0 9234 

0 9093 

24 

NESTOR 

0-9213 

0 9096 

23 

THINK  J 

0 9206 

0 9113 

26 

HUGHES, I 

0.9193 

0 9067 

27 

HUGHES-2 

0 9192 

0 9066 

28 

REI 

0 9169 

0 9092 

29 

NYNEX 

0.9139 

0 9074 

30 

NlST-1 

0.9080 

0 8876 

31 

NIST-2 

0 9071 

0.8810 

32 

NIST-3 

0 9043 

0 8772 

33 

MIME 

0 9007 

0 8809 

34 

GMD_3 

0 9002 

0 8821 

33 

ASOL 

0 9001 

0 8790 

36 

COMCOM 

0 9000 

0 8972 

37 

UPENN 

0 8941 

0 8748 

38 

GMD-1 

0.8931 

0.8761 

39 

RISC 

0 8930 

0 8669 

40 

KAMAN-l 

0 8833 

0.8393 

41 

GMD-4 

0.8788 

0 8620 

42 

KAMAN-3 

0.8662 

0 8432 

43 

KAMAN.2 

0 8642 

0 8408 

44 

GMD-2 

0.8437 

0 8223 

43 

KAMAN-4 

0 8392 

0 8212 

46 

VALEN-2 

0.8220 

0.8102 

47 

IFAX 

0.8183 

0 8007 

48 

valen.i 

0 8040 

0.7884 

49 

KAMAN-4 

0.7963 

0 7715 

Table  62:  GTESS-2  correlation  graph  key  for  digits. 
[Plot: GTESS_2 upper case correlation by system number]

Figure 99: GTESS_2 - upper case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              GTESS_2      1.0000             1.0000
2              GTESS_1      0.9064             0.9139
3              VOTE_M       0.9274             0.9107
4              REFERENCE    0.9106             0.9106
5              AEG          0.9172             0.9024
6              ATT_1        0.9157             0.8967
7              ERIM_1       0.9101             0.8942
8              ATT_?        0.9091             0.8912
9              NYNEX        0.9065             0.8920
10             KODAK_?      0.9060             0.8054
11             VOTE_P       0.9044             0.8943
12             UBOL         0.9031             0.8049
13             ATT_?        0.9017             0.8041
14             ATT_3        0.8999             0.8015
15             UMICH_1      0.8906             0.8069
16             NESTOR       0.8960             0.8040
17             SYMBUS       0.8965             0.8701
18             HUGHES_1     0.8946             0.8798
19             HUGHES_2     0.8924             0.8775
20             IBM          0.8914             0.8791
21             OCRSYS       0.8039             0.8723
22             MIME         0.8751             0.8546
23             NIST_4       0.8663             0.8400
24             ASOL         0.8650             0.8455
25             REI          0.8426             0.8312
26             RISO         0.8425             0.8204
27             NIST_1       0.8404             0.8203
28             GMD_1        0.8345             0.8170
29             GMD_3        0.8331             0.8150
30             KAMAN_1      0.8223             0.8065
31             GMD_4        0.8161             0.7996
32             NIST_3       0.8117             0.7934
33             COMCOM       0.7930             0.7070
34             IFAX         0.7808             0.7640
35             KAMAN_3      0.7792             0.7633
36             KAMAN_2      0.7722             0.7546
37             NIST_2       0.7625             0.7406
38             GMD_2        0.7450             0.7239
39             VALEN_1      0.7350             0.7107
40             KAMAN_4      0.7117             0.6933
41             KAMAN_5      0.6434             0.6294
42             UMICH_2      0.0491             0.0229

Table 63: GTESS_2 correlation graph key for uppers.
[Plot: GTESS_2 lower case correlation by system number]

Figure 100: GTESS_2 - lower case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              GTESS_2      1.0000             1.0000
2              GTESS_1      0.8892             0.7779
3              VOTE_M       0.8470             0.7965
4              ATT_2        0.8184             0.7652
5              REFERENCE    0.8158             0.8158
6              ERIM_1       0.8125             0.7641
7              ATT_4        0.8112             0.7605
8              AEG          0.8067             0.7617
9              KODAK_?      0.8040             0.7573
10             ATT_?        0.8018             0.7563
11             NYNEX        0.8000             0.7570
12             ATT_?        0.7918             0.7431
13             IBM          0.7900             0.7456
14             VOTE_P       0.7878             0.7601
15             UBOL         0.7851             0.7410
16             OCRSYS       0.7843             0.7483
17             NESTOR       0.7822             0.7429
18             HUGHES_1     0.7822             0.7393
19             UMICH_1      0.7799             0.7407
20             HUGHES_2     0.7796             0.7370
21             NIST_1       0.7740             0.7262
22             ASOL         0.7547             0.7058
23             RISO         0.7538             0.7039
24             NIST_4       0.7523             0.7067
25             GMD_3        0.7474             0.7032
26             NIST_?       0.7413             0.7079
27             GMD_4        0.7327             0.6888
28             GMD_1        0.7327             0.6888
29             GMD_2        0.6992             0.6507
30             NIST_?       0.6783             0.6279
31             KAMAN_1      0.6571             0.6157
32             VALEN_1      0.6550             0.6117
33             KAMAN_3      0.6363             0.5949
34             KAMAN_2      0.6198             0.5828
35             KAMAN_5      0.5490             0.5156
36             KAMAN_4      0.5224             0.4880
37             COMCOM       0.4830             0.4712
38             UMICH_2      0.1070             0.0542

Table 64: GTESS_2 correlation graph key for lowers.
SYSTEM: HUGHES_1

PARTICIPANT: Tony Baraghimian

ORGANIZATION: Hughes Aircraft Company, Canoga Park, CA

FEATURES: ?

CLASSIFICATION: fusion of results of multiple nonparametric
algorithms (neocognitron)

HARDWARE: single Intel i860 in a Datacube computer

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           10000   7800    7800    NSDB3

STATUS: on time

RESULTS:   -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
           REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
           RATE    RATE     RATE    RATE     RATE    RATE
           0.00    0.0484   0.00    0.0646   0.00    0.1539
           0.10    0.0173   0.10    0.0301   0.10    0.1129
           0.20    0.0064   0.20    0.0169   0.20    0.0806
           0.30    0.0036   0.30    0.0105   0.30    0.0529
           0.40    0.0022   0.40    0.0071   0.40    0.0362
           0.50    0.0015   0.50    0.0055   0.50    0.0270

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:
CPU RATE:        21.00   19.00   19.00

NOTE: proprietary architecture using a neural net classifier. Few details of the recognition
algorithm were provided.
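
The RESULTS table pairs each rejection rate (0.00 through 0.50) with an error rate. The following is a minimal sketch of how such a curve could be computed. It assumes each classified character carries a confidence value and that the error rate is measured over the characters that remain after the least confident fraction is rejected; the report's scoring package may define it differently. The function name and the toy data are hypothetical and are not taken from the report.

```python
def error_at_rejection(confidences, is_correct, reject_fraction):
    """Error rate among accepted characters after rejecting the least
    confident `reject_fraction` of all characters (assumed definition)."""
    if len(confidences) != len(is_correct):
        raise ValueError("confidences and is_correct must have equal length")
    if not 0.0 <= reject_fraction < 1.0:
        raise ValueError("reject_fraction must be in [0, 1)")
    # Indices ordered from least to most confident.
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    n_reject = int(round(reject_fraction * len(order)))
    accepted = order[n_reject:]
    if not accepted:
        return 0.0
    errors = sum(1 for i in accepted if not is_correct[i])
    return errors / len(accepted)


# Toy data: ten characters; both errors fall among the three least confident.
confidences = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
is_correct = [False, True, False, True, True, True, True, True, True, True]
curve = [(r, error_at_rejection(confidences, is_correct, r)) for r in (0.0, 0.1, 0.3)]
```

On the toy data the error rate falls as more low-confidence characters are rejected, which is the same trade-off the table reports.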
SYSTEM: HUGHES_1
BIBLIOGRAPHY:

The following references have been provided for this system:

[19]

COMMENTS: HUGHES

HUGHES Recognition Systems brings complete document image processing solutions based on
its proven success in advanced imaging and recognition technology. We provide a wide range of
technology, solutions, and services from system analysis through system integration, training, and
support.

HUGHES develops sophisticated subsystem solutions easily tailorable to your application for
pre-processing, intelligent recognition, contextual analysis, and more. We accommodate image lift from
a variety of sources directly into our pre-processing subcomponent. We apply unique pre-processing
techniques such as image quality control, registration, and enhancement, as well as form
identification, suppression, and field isolation. The result feeds immediately into HUGHES' intelligent
recognition subcomponent, or any other you provide. With technologies such as artificial networks
and fuzzy logic, our pre-processing in concert with our intelligent recognizer provides maximum
performance. The flexible pre-processing also enables higher performance of your own recognition
system. Further enhancements to recognition performance are accomplished by contextual analysis
in our post-processing subcomponent.

We also offer traditional subsystems for image acquisition, format conversion, work flow, forms
editing, image storage, and much more.

HUGHES Recognition Systems participated in the First Census OCR Systems Conference in May
1992. Our test results were highly competitive, among the top performing group of participants.

For more information, please contact Tony Baraghimian at (818) 702-1580.
[Plot: error rate (%) versus rejection rate (%) for HUGHES_1; curves for digits, uppers, and lowers]

Figure 101: Error rate versus rejection rate for HUGHES_1

Figure 102: Error rate per writer of HUGHES_1
[Plot: HUGHES_1 digit correlation by system number]

Figure 103: HUGHES_1 - digit correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              HUGHES_1     1.0000             1.0000
2              HUGHES_2     0.9913             0.9479
3              VOTE_M       0.9383             0.9448
4              AEG          0.9326             0.9378
5              OCRSYS       0.9320             0.9433
6              REFERENCE    0.9316             0.9316
7              ERIM_1       0.9493             0.9343
8              VOTE_P       0.9483             0.9383
9              IBM          0.9474             0.9331
10             ERIM_2       0.9473             0.9329
11             ATT_?        0.9468             0.9341
12             ELSAGB_?     0.9437             0.9330
13             ELSAGB_?     0.9433             0.9347
14             KODAK_2      0.9442             0.9306
15             ATT_4        0.9439             0.9303
16             ATT_?        0.9433             0.9342
17             THINK_2      0.9426             0.9311
18             UBOL         0.9401             0.9273
19             NESTOR       0.9393             0.9268
20             ELSAGB_1     0.9392             0.9243
21             KODAK_1      0.9388             0.9231
22             REI          0.9377             0.9282
23             NIST_4       0.9377             0.9237
24             SYMBUS       0.9366             0.9242
25             NYNEX        0.9349             0.9230
26             ATT_?        0.9347             0.9233
27             THINK_1      0.9341             0.9220
28             GTESS_1      0.9207             0.9081
29             COMCOM       0.9206             0.9170
30             GTESS_2      0.9193             0.9067
31             NIST_1       0.9091             0.8966
32             GMD_3        0.9082             0.8947
33             UPENN        0.9043             0.8877
34             ASOL         0.9018             0.8878
35             MIME         0.9014             0.8890
36             GMD_1        0.9010             0.8881
37             NIST_2       0.8993             0.8837
38             NIST_3       0.8949             0.8811
39             RISO         0.8890             0.8732
40             GMD_4        0.8866             0.8740
41             KAMAN_1      0.8803             0.8633
42             KAMAN_3      0.8664             0.8308
43             KAMAN_2      0.8631             0.8480
44             KAMAN_5      0.8443             0.8304
45             GMD_2        0.8416             0.8273
46             VALEN_2      0.8371             0.8234
47             IFAX         0.8266             0.8113
48             VALEN_1      0.8144             0.8003
49             KAMAN_4      0.7894             0.7760

Table 65: HUGHES_1 correlation graph key for digits.
[Plot: HUGHES_1 upper case correlation by system number]

Figure 104: HUGHES_1 - upper case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              HUGHES_1     1.0000             1.0000
2              HUGHES_2     0.9907             0.9296
3              VOTE_M       0.9411             0.9254
4              REFERENCE    0.9354             0.9354
5              AEG          0.9348             0.9198
6              ERIM_1       0.9252             0.9083
7              ATT_4        0.9245             0.9088
8              UMICH_1      0.9199             0.9056
9              UBOL         0.9188             0.8997
10             NYNEX        0.9182             0.9060
11             VOTE_P       0.9158             0.9055
12             NESTOR       0.9140             0.8991
13             ATT_2        0.9131             0.8995
14             IBM          0.9114             0.8959
15             KODAK_?      0.9073             0.8917
16             SYMBUS       0.9070             0.8893
17             ATT_?        0.9056             0.8916
18             ATT_?        0.9053             0.8921
19             OCRSYS       0.8967             0.8848
20             GTESS_1      0.8956             0.8812
21             GTESS_2      0.8946             0.8798
22             MIME         0.8793             0.8631
23             NIST_4       0.8774             0.8602
24             ASOL         0.8673             0.8518
25             REI          0.8591             0.8447
26             RISO         0.8443             0.8263
27             GMD_1        0.8442             0.8272
28             NIST_1       0.8439             0.8275
29             GMD_3        0.8416             0.8250
30             KAMAN_1      0.8361             0.8184
31             GMD_4        0.8246             0.8089
32             COMCOM       0.8053             0.7986
33             NIST_3       0.8051             0.7947
34             IFAX         0.7898             0.7727
35             KAMAN_3      0.7894             0.7728
36             KAMAN_2      0.7798             0.7633
37             NIST_2       0.7573             0.7421
38             VALEN_1      0.7496             0.7309
39             GMD_2        0.7457             0.7283
40             KAMAN_4      0.7174             0.7010
41             KAMAN_5      0.6481             0.6360
42             UMICH_2      0.0458             0.0204

Table 66: HUGHES_1 correlation graph key for uppers.
[Plot: HUGHES_1 lower case correlation by system number]

Figure 105: HUGHES_1 - lower case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              HUGHES_1     1.0000             1.0000
2              HUGHES_2     0.9800             0.8363
3              VOTE_M       0.8715             0.8215
4              REFERENCE    0.8461             0.8461
5              AEG          0.8440             0.7933
6              ERIM_1       0.8404             0.7893
7              UBOL         0.8273             0.7732
8              OCRSYS       0.8267             0.7810
9              IBM          0.8252             0.7747
10             UMICH_1      0.8232             0.7732
11             NYNEX        0.8209             0.7795
12             KODAK_?      0.8183             0.7738
13             ATT_?        0.8154             0.7747
14             ATT_4        0.8153             0.7734
15             ATT_?        0.8129             0.7738
16             ATT_?        0.8108             0.7648
17             NESTOR       0.8062             0.7664
18             VOTE_P       0.8056             0.7771
19             GTESS_1      0.7937             0.7493
20             GTESS_2      0.7822             0.7393
21             NIST_4       0.7817             0.7323
22             NIST_1       0.7744             0.7364
23             RISO         0.7647             0.7173
24             GMD_3        0.7629             0.7212
25             ASOL         0.7512             0.7128
26             GMD_4        0.7472             0.7057
27             GMD_1        0.7472             0.7057
28             NIST_?       0.7293             0.7105
29             GMD_2        0.6917             0.6536
30             KAMAN_1      0.6762             0.6332
31             VALEN_1      0.6757             0.6309
32             NIST_?       0.6675             0.6302
33             KAMAN_3      0.6525             0.6112
34             KAMAN_2      0.6351             0.5970
35             KAMAN_5      0.5642             0.5293
36             KAMAN_4      0.5293             0.4966
37             COMCOM       0.5004             0.4886
38             UMICH_2      0.1015             0.0514

Table 67: HUGHES_1 correlation graph key for lowers.
SYSTEM: HUGHES_2

PARTICIPANT: Tony Baraghimian

ORGANIZATION: Hughes Aircraft Company, Missiles Systems Group,
Canoga Park, CA

FEATURES:

CLASSIFICATION: fusion of results of multiple nonparametric
algorithms (neocognitron)

HARDWARE: single Intel i860 in a Datacube computer

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           10000   7800    7800    NIST SPECIAL DATABASE 3, random

STATUS: on time

RESULTS:   -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
           REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
           RATE    RATE     RATE    RATE     RATE    RATE
           0.00    0.0486   0.00    0.0673   0.00    0.1559
           0.10    0.0181   0.10    0.0332   0.10    0.1176
           0.20    0.0068   0.20    0.0147   0.20    0.0781
           0.30    0.0038   0.30    0.0092   0.30    0.0493
           0.40    0.0022   0.40    0.0061   0.40    0.0307
           0.50    0.0015   0.50    0.0045   0.50    0.0202

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:
CPU RATE:        21.00   19.00   19.00


SYSTEM:  HUGHES_2 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[19] 
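
The summary sheet describes the classifier only as a "fusion of results of multiple nonparametric algorithms" and gives no fusion rule. Purely as an illustration of what fusing several classifier outputs can look like, the sketch below uses plain plurality voting with rejection on ties. This is an assumed stand-in, not the Hughes method, and the function name is hypothetical.

```python
from collections import Counter


def fuse_by_vote(labels):
    """Return the label chosen by most classifiers, or None (reject)
    when there is no input or the top two labels tie."""
    if not labels:
        return None
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear winner, so reject the character
    return ranked[0][0]


# Three hypothetical classifiers label the same character image.
decision = fuse_by_vote(["7", "7", "1"])
tied = fuse_by_vote(["7", "1"])
```

Rejecting on ties is one simple way a fused system can trade rejection rate against error rate; weighted or confidence-based fusion would be equally plausible alternatives.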
[Plot: error rate (%) versus rejection rate for HUGHES_2; curves for digits, uppers, and lowers]

Figure 106: Error rate versus rejection rate for HUGHES_2

Figure 107: Error rate per writer of HUGHES_2
[Plot: HUGHES_2 digit correlation by system number]

Figure 108: HUGHES_2 - digit correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              HUGHES_2     1.0000             1.0000
2              HUGHES_1     0.9915             0.9479
3              VOTE_M       0.9578             0.9446
4              AEG          0.9520             0.9375
5              OCRSYS       0.9519             0.9451
6              REFERENCE    0.9514             0.9514
7              ERIM_1       0.9491             0.9344
8              VOTE_P       0.9480             0.9384
9              IBM          0.9474             0.9351
10             ERIM_2       0.9469             0.9328
11             ATT_2        0.9464             0.9340
12             ELSAGB_3     0.9459             0.9349
13             ELSAGB_2     0.9464             0.9345
14             ATT_?        0.9434             0.9342
15             KODAK_2      0.9432             0.9301
16             ATT_4        0.9429             0.9299
17             THINK_2      0.9416             0.9306
18             UBOL         0.9396             0.9272
19             NESTOR       0.9395             0.9268
20             ELSAGB_1     0.9389             0.9241
21             KODAK_1      0.9379             0.9248
22             NIST_4       0.9374             0.9235
23             REI          0.9372             0.9278
24             SYMBUS       0.9370             0.9244
25             ATT_?        0.9349             0.9235
26             NYNEX        0.9344             0.9247
27             THINK_1      0.9336             0.9217
28             GTESS_1      0.9207             0.9080
29             COMCOM       0.9203             0.9168
30             GTESS_2      0.9192             0.9066
31             NIST_1       0.9089             0.8965
32             GMD_3        0.9076             0.8944
33             UPENN        0.9031             0.8874
34             ASOL         0.9011             0.8873
35             MIME         0.9008             0.8887
36             GMD_1        0.9004             0.8879
37             NIST_2       0.8996             0.8857
38             NIST_3       0.8945             0.8809
39             RISO         0.8875             0.8725
40             GMD_4        0.8860             0.8738
41             KAMAN_1      0.8802             0.8653
42             KAMAN_3      0.8663             0.8507
43             KAMAN_2      0.8630             0.8480
44             KAMAN_5      0.8452             0.8309
45             GMD_2        0.8421             0.8278
46             VALEN_2      0.8360             0.8228
47             IFAX         0.8253             0.8111
48             VALEN_1      0.8157             0.8010
49             KAMAN_4      0.7891             0.7759

Table 68: HUGHES_2 correlation graph key for digits.
[Plot: HUGHES_2 upper case correlation by system number]

Figure 109: HUGHES_2 - upper case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              HUGHES_2     1.0000             1.0000
2              HUGHES_1     0.9907             0.9296
3              VOTE_M       0.9389             0.9230
4              REFERENCE    0.9327             0.9327
5              AEG          0.9324             0.9173
6              ERIM_1       0.9237             0.9063
7              ATT_4        0.9216             0.9060
8              UMICH_1      0.9179             0.9034
9              NYNEX        0.9169             0.9042
10             UBOL         0.9166             0.8975
11             VOTE_P       0.9138             0.9034
12             NESTOR       0.9121             0.8970
13             ATT_?        0.9106             0.8969
14             IBM          0.9106             0.8946
15             SYMBUS       0.9064             0.8872
16             KODAK_?      0.9047             0.8801
17             ATT_?        0.9044             0.8896
18             ATT_?        0.9032             0.8897
19             OCRSYS       0.8962             0.8829
20             GTESS_1      0.8940             0.8791
21             GTESS_2      0.8924             0.8776
22             MIME         0.8778             0.8612
23             NIST_4       0.8769             0.8684
24             ASOL         0.8666             0.8498
25             REI          0.8687             0.8433
26             GMD_1        0.8440             0.8261
27             NIST_1       0.8427             0.8261
28             RISO         0.8426             0.8246
29             GMD_3        0.8414             0.8238
30             KAMAN_1      0.8349             0.8168
31             GMD_4        0.8241             0.8076
32             COMCOM       0.8040             0.7971
33             NIST_?       0.8034             0.7931
34             IFAX         0.7901             0.7723
35             KAMAN_3      0.7881             0.7714
36             KAMAN_2      0.7792             0.7622
37             NIST_?       0.7571             0.7409
38             VALEN_1      0.7479             0.7293
39             GMD_2        0.7464             0.7277
40             KAMAN_4      0.7162             0.6997
41             KAMAN_5      0.6479             0.6361
42             UMICH_2      0.0468             0.0203

Table 69: HUGHES_2 correlation graph key for uppers.
[Plot: HUGHES_2 lower case correlation by system number]

Figure 110: HUGHES_2 - lower case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              HUGHES_2     1.0000             1.0000
2              HUGHES_1     0.9800             0.8363
3              VOTE_M       0.8679             0.8190
4              REFERENCE    0.8441             0.8441
5              AEG          0.8427             0.7916
6              ERIM_1       0.8399             0.7882
7              UBOL         0.8257             0.7715
8              OCRSYS       0.8252             0.7794
9              UMICH_1      0.8227             0.7723
10             IBM          0.8220             0.7724
11             NYNEX        0.8172             0.7768
12             KODAK_?      0.8163             0.7719
13             ATT_4        0.8130             0.7718
14             ATT_?        0.8128             0.7728
15             ATT_?        0.8105             0.7717
16             ATT_?        0.8070             0.7622
17             NESTOR       0.8052             0.7654
18             VOTE_P       0.8028             0.7748
19             GTESS_1      0.7917             0.7473
20             GTESS_2      0.7796             0.7370
21             NIST_4       0.7782             0.7295
22             NIST_1       0.7707             0.7337
23             GMD_3        0.7607             0.7190
24             RISO         0.7602             0.7145
25             ASOL         0.7512             0.7119
26             GMD_4        0.7459             0.7040
27             GMD_1        0.7459             0.7040
28             NIST_?       0.7251             0.7074
29             GMD_2        0.6889             0.6511
30             VALEN_1      0.6769             0.6308
31             KAMAN_1      0.6747             0.6324
32             NIST_?       0.6653             0.6284
33             KAMAN_3      0.6515             0.6109
34             KAMAN_2      0.6342             0.5968
35             KAMAN_5      0.5648             0.5293
36             KAMAN_4      0.5288             0.4958
37             COMCOM       0.4993             0.4872
38             UMICH_2      0.1008             0.0503

Table 70: HUGHES_2 correlation graph key for lowers.
SYSTEM: IBM

PARTICIPANT: Dr. K. M. Mohiuddin

ORGANIZATION: IBM Almaden Research Center, San Jose, CA

FEATURES: geometrical and zonal patterns, including bending points
and areas of significant direction change around the contour.

CLASSIFICATION: 3-layer NN: 184 input units (96 for bending points and 88
for direction changes), 40 hidden units (static for all
experiments), 10 output units for digits, 26 for upper
case and 26 for lower case.

HARDWARE: RS/6000 Model 530 running AIX

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           80%     100%    100%    NSDB3

STATUS: on time

RESULTS:   -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
           REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
           RATE    RATE     RATE    RATE     RATE    RATE
           0.00    0.0349   0.00    0.0641   0.00    0.1542
           0.10    0.0071   0.10    0.0234   0.10    0.1061
           0.20    0.0037   0.20    0.0090   0.20    0.0730
           0.30    0.0038   0.30    0.0050   0.30    0.0482
           0.40    0.0040   0.40    0.0054   0.40    0.0307
           0.50    0.0038   0.50    0.0052   0.50    0.0183

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        86.95   80.97   89.04
CPU RATE:        200.    194.17  194.17



SYSTEM:  IBM 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[20][24][21][22][23]  [24]  [25][26] 
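
The CLASSIFICATION entry gives the layer sizes of the network: 184 inputs (96 + 88), 40 hidden units, and 10 outputs for digits. The sketch below only illustrates a forward pass through a network of that shape. The sigmoid activation, the random weights, and all function names are assumptions made for illustration; the summary sheet does not state the activation function or the training procedure.

```python
import math
import random

N_INPUT = 96 + 88   # bending-point units + direction-change units = 184
N_HIDDEN = 40
N_OUTPUT = 10       # digits; the summary gives 26 for upper or lower case


def make_layer(n_in, n_out, rng):
    """Small random weights and zero biases, standing in for trained values."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_forward(layer, inputs):
    """Weighted sum plus bias, followed by a logistic sigmoid (assumed)."""
    weights, biases = layer
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(weights, biases)
    ]


def classify(features, hidden_layer, output_layer):
    """Return the index of the strongest output unit and all output values."""
    hidden = layer_forward(hidden_layer, features)
    outputs = layer_forward(output_layer, hidden)
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    return best, outputs


rng = random.Random(0)
hidden_layer = make_layer(N_INPUT, N_HIDDEN, rng)
output_layer = make_layer(N_HIDDEN, N_OUTPUT, rng)
features = [rng.random() for _ in range(N_INPUT)]  # placeholder feature vector
label, outputs = classify(features, hidden_layer, output_layer)
```

With untrained weights the predicted label is meaningless; the point is only the data flow from a 184-element feature vector to one output value per class.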
[Plot: error rate (%) versus rejection rate (%) for IBM; curves for digits, uppers, and lowers]

Figure 111: Error rate versus rejection rate for IBM

[Plot: number of writers with error for IBM]

Figure 112: Error rate per writer of IBM
[Plot: IBM digit correlation by system number]

Figure 113: IBM - digit correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              IBM          1.0000             1.0000
2              VOTE_M       0.9677             0.9557
3              REFERENCE    0.9651             0.9661
4              OCRSYS       0.9641             0.9577
5              ATT_?        0.9583             0.9460
6              AEG          0.9578             0.9464
7              VOTE_P       0.9570             0.9483
8              ELSAGB_?     0.9556             0.9460
9              ELSAGB_?     0.9553             0.9457
10             ERIM_1       0.9549             0.9432
11             ATT_?        0.9535             0.9456
12             KODAK_?      0.9529             0.9408
13             ERIM_2       0.9523             0.9417
14             NESTOR       0.9523             0.9386
15             ATT_4        0.9514             0.9404
16             THINK_2      0.9501             0.9407
17             REI          0.9489             0.9400
18             KODAK_?      0.9480             0.9354
19             UBOL         0.9477             0.9374
20             HUGHES_2     0.9474             0.9351
21             HUGHES_1     0.9474             0.9351
22             NYNEX        0.9470             0.9371
23             ELSAGB_1     0.9447             0.9325
24             SYMBUS       0.9442             0.9341
25             THINK_1      0.9440             0.9328
26             ATT_?        0.9436             0.9338
27             NIST_4       0.9433             0.9323
28             COMCOM       0.9311             0.9280
29             GTESS_1      0.9272             0.9170
30             GTESS_2      0.9262             0.9157
31             NIST_1       0.9163             0.9059
32             GMD_3        0.9141             0.9029
33             MIME         0.9105             0.8989
34             ASOL         0.9085             0.8965
35             GMD_1        0.9075             0.8967
36             UPENN        0.9069             0.8945
37             NIST_2       0.9051             0.8939
38             NIST_3       0.9003             0.8889
39             RISO         0.8957             0.8816
40             GMD_4        0.8928             0.8823
41             KAMAN_1      0.8898             0.8745
42             KAMAN_?      0.8737             0.8584
43             KAMAN_2      0.8701             0.8558
44             KAMAN_?      0.8513             0.8380
45             GMD_2        0.8469             0.8349
46             VALEN_2      0.8403             0.8296
47             IFAX         0.8295             0.8174
48             VALEN_1      0.8209             0.8086
49             KAMAN_4      0.7945             0.7824

Table 71: IBM correlation graph key for digits.
[Plot: IBM upper case correlation by system number]

Figure 114: IBM - upper case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              IBM          1.0000             1.0000
2              VOTE_M       0.9430             0.9272
3              REFERENCE    0.9339             0.9369
4              ATT_4        0.9298             0.9127
5              AEG          0.9296             0.9173
6              UMICH_1      0.9274             0.9106
7              ERIM_1       0.9243             0.9090
8              NYNEX        0.9231             0.9094
9              ATT_2        0.9198             0.9048
10             NESTOR       0.9191             0.9038
11             VOTE_P       0.9178             0.9081
12             UBOL         0.9126             0.8979
13             HUGHES_1     0.9114             0.8969
14             HUGHES_2     0.9106             0.8946
15             KODAK_?      0.9101             0.8947
16             ATT_?        0.9087             0.8949
17             SYMBUS       0.9071             0.8913
18             ATT_?        0.9039             0.8934
19             OCRSYS       0.8982             0.8866
20             GTESS_1      0.8926             0.8802
21             GTESS_2      0.8914             0.8791
22             MIME         0.8860             0.8673
23             NIST_4       0.8773             0.8617
24             ASOL         0.8712             0.8337
25             REI          0.8634             0.8490
26             RISO         0.8328             0.8331
27             KAMAN_1      0.8482             0.8269
28             GMD_1        0.8470             0.8308
29             GMD_3        0.8463             0.8290
30             NIST_1       0.8449             0.8302
31             GMD_4        0.8277             0.8124
32             NIST_3       0.8112             0.8013
33             COMCOM       0.8063             0.8004
34             KAMAN_3      0.7988             0.7801
35             KAMAN_2      0.7906             0.7710
36             IFAX         0.7893             0.7746
37             NIST_2       0.7607             0.7463
38             VALEN_1      0.7674             0.7378
39             GMD_2        0.7611             0.7334
40             KAMAN_4      0.7264             0.7076
41             KAMAN_5      0.6603             0.6436
42             UMICH_2      0.0408             0.0193

Table 72: IBM correlation graph key for uppers.
[Plot: IBM lower case correlation by system number]

Figure 115: IBM - lower case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              IBM          1.0000             1.0000
2              VOTE_M       0.8760             0.8264
3              REFERENCE    0.8467             0.8467
4              ERIM_1       0.8443             0.7932
5              OCRSYS       0.8342             0.7863
6              AEG          0.8339             0.7897
7              UMICH_1      0.8339             0.7817
8              ATT_?        0.8316             0.7868
9              NESTOR       0.8311             0.7797
10             ATT_4        0.8303             0.7828
11             NYNEX        0.8296             0.7837
12             HUGHES_1     0.8252             0.7747
13             ATT_?        0.8224             0.7793
14             HUGHES_2     0.8220             0.7724
15             UBOL         0.8184             0.7706
16             ATT_?        0.8183             0.7699
17             KODAK_?      0.8160             0.7749
18             VOTE_P       0.8114             0.7826
19             GTESS_1      0.8023             0.7666
20             GTESS_2      0.7900             0.7466
21             NIST_1       0.7828             0.7413
22             RISO         0.7811             0.7272
23             NIST_4       0.7762             0.7312
24             GMD_3        0.7636             0.7236
25             ASOL         0.7633             0.7203
26             GMD_4        0.7489             0.7088
27             GMD_1        0.7489             0.7088
28             NIST_?       0.7337             0.7142
29             GMD_2        0.7097             0.6646
30             KAMAN_1      0.6848             0.6390
31             VALEN_1      0.6837             0.6368
32             NIST_?       0.6783             0.6367
33             KAMAN_3      0.6624             0.6177
34             KAMAN_?      0.6436             0.6032
35             KAMAN_?      0.6724             0.6362
36             KAMAN_4      0.6284             0.4976
37             COMCOM       0.4966             0.4866
38             UMICH_2      0.0922             0.0482

Table 73: IBM correlation graph key for lowers.
SYSTEM: IFAX

PARTICIPANT: Leonid Nilva

ORGANIZATION: InterFax, Inc., Sunnyvale, CA

FEATURES: Shape and Histogram based,
adaptively selected relevant subset of over 500 features

CLASSIFICATION: series of adaptive affine transformations

HARDWARE:

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           ?       ?       NA      NSDB3

STATUS: on time

RESULTS:   -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
           REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
           RATE    RATE     RATE    RATE     RATE    RATE
           0.00    0.1707   0.00    0.1960
           0.10    0.1249   0.10    0.1498
           0.20    0.0897   0.20    0.1198
           0.30    0.0626   0.30    0.0974
           0.40    0.0491   0.40    0.0794
           0.50    0.0335   0.50    0.0648

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        20.00   12.00   NA
CPU RATE:

NOTE: Few details of features or classification provided.
SYSTEM: IFAX
BIBLIOGRAPHY:

The following references have been provided for this system:
none

COMMENTS:

InterFax, headquartered in Sunnyvale, California, develops and markets an integrated family of
robust application development tools for fax information processing.

One of InterFax's new products, code-named Harvest, is an object-oriented development
environment that automates the reading and entering of data from faxed forms into host transaction
systems. These forms can have hand-printed numbers or letters, machine-printed characters, mark
sense boxes, graphics, or other images. Harvest reads and interprets forms from fax machines
or scanners. Once a form is read and verified, the information is automatically sent to the host
computer application and a fax response or confirmation is generated.

Harvest will be available for commercial use in the fourth quarter of 1992. Initial release of the
product will support IBM mainframe and AS/400 host computers. The implementation platform is
486 IBM compatible computers with OS/2 2.0 operating system and C++ programming language.

The hand-printed character recognition used in the First Census OCR Systems Conference is a
prototype algorithm, one of a couple that InterFax may pursue. The engine utilizes geometric
feature extraction and a modified k-nearest neighbor classifier.
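
The comments above name a modified k-nearest neighbor classifier but do not describe the modification. For orientation only, here is a minimal sketch of the standard, unmodified technique: a query feature vector receives the majority label of its k closest training vectors. The Euclidean distance, the value of k, and the toy two-dimensional features are assumptions for illustration and are not InterFax's algorithm.

```python
import math
from collections import Counter


def knn_classify(training_set, query, k=3):
    """Plain k-nearest neighbor: majority label among the k training
    vectors closest to `query` under Euclidean distance."""
    if k < 1 or k > len(training_set):
        raise ValueError("k must be between 1 and the training set size")
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set of (feature vector, label) pairs; real geometric
# features would have many more dimensions.
training_set = [
    ((0.0, 0.0), "0"), ((0.0, 1.0), "0"), ((1.0, 0.0), "0"),
    ((5.0, 5.0), "1"), ((5.0, 6.0), "1"), ((6.0, 5.0), "1"),
]
near_zero = knn_classify(training_set, (0.2, 0.1))
near_one = knn_classify(training_set, (5.5, 5.2))
```

Because every query is compared against the stored training vectors, recognition speed in a classifier of this kind depends directly on the size of the training set.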
[Plot: error rate (%) versus rejection rate (%) for IFAX; curves for digits and uppers]

Figure 116: Error rate versus rejection rate for IFAX

Figure 117: Error rate per writer of IFAX
[Plot: IFAX digit correlation by system number]

Figure 118: IFAX - digit correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              IFAX         1.0000             1.0000
2              VOTE_M       0.8370             0.8248
3              ERIM_1       0.8340             0.8185
4              AEG          0.8336             0.8194
5              ATT_?        0.8319             0.8181
6              ERIM_2       0.8312             0.8173
7              KODAK_2      0.8304             0.8155
8              VOTE_P       0.8301             0.8212
9              OCRSYS       0.8298             0.8241
10             ATT_4        0.8297             0.8152
11             IBM          0.8295             0.8174
12             REFERENCE    0.8293             0.8293
13             UBOL         0.8293             0.8143
14             NIST_4       0.8289             0.8118
15             ELSAGB_3     0.8287             0.8177
16             SYMBUS       0.8286             0.8134
17             ELSAGB_2     0.8281             0.8172
18             NESTOR       0.8279             0.8131
19             ATT_?        0.8275             0.8172
20             THINK_1      0.8273             0.8114
21             KODAK_1      0.8271             0.8117
22             HUGHES_1     0.8266             0.8115
23             ATT_?        0.8264             0.8123
24             HUGHES_2     0.8253             0.8111
25             NYNEX        0.8252             0.8127
26             THINK_2      0.8249             0.8139
27             ELSAGB_1     0.8247             0.8092
28             REI          0.8221             0.8128
29             GTESS_1      0.8192             0.8015
30             GTESS_2      0.8183             0.8007
31             GMD_3        0.8113             0.7915
32             MIME         0.8082             0.7881
33             UPENN        0.8078             0.7862
34             COMCOM       0.8072             0.8032
35             RISO         0.8064             0.7802
36             NIST_1       0.8062             0.7898
37             GMD_1        0.8059             0.7863
38             ASOL         0.8057             0.7860
39             NIST_?       0.8044             0.7843
40             NIST_?       0.8029             0.7825
41             KAMAN_1      0.8003             0.7734
42             GMD_4        0.7945             0.7744
43             KAMAN_3      0.7878             0.7603
44             KAMAN_2      0.7848             0.7581
45             KAMAN_5      0.7653             0.7412
46             GMD_2        0.7631             0.7403
47             VALEN_2      0.7625             0.7391
48             VALEN_1      0.7479             0.7196
49             KAMAN_4      0.7267             0.6971

Table 74: IFAX correlation graph key for digits.
[Plot: IFAX upper case correlation by system number]

Figure 119: IFAX - upper case correlation

System Number  System Name  Correlation (all)  Correlation (correct)
1              IFAX         1.0000             1.0000
2              VOTE_M       0.8100             0.7967
3              REFERENCE    0.8040             0.8040
4              AEG          0.8007             0.7889
5              ATT_4        0.8004             0.7846
6              UMICH_1      0.7992             0.7834
7              ERIM_1       0.7949             0.7814
8              NYNEX        0.7947             0.7828
9              ATT_2        0.7946             0.7799
10             UBOL         0.7943             0.7762
11             VOTE_P       0.7927             0.7836
12             NESTOR       0.7909             0.7773
13             KODAK_?      0.7901             0.7738
14             HUGHES_2     0.7901             0.7723
15             HUGHES_1     0.7898             0.7727
16             IBM          0.7893             0.7746
17             ATT_?        0.7883             0.7731
18             SYMBUS       0.7877             0.7706
19             ATT_?        0.7870             0.7723
20             GTESS_1      0.7822             0.7649
21             GTESS_2      0.7808             0.7640
22             OCRSYS       0.7738             0.7642
23             MIME         0.7726             0.7329
24             NIST_4       0.7713             0.7307
25             ASOL         0.7638             0.7449
26             REI          0.7349             0.7337
27             KAMAN_1      0.7301             0.7217
28             RISO         0.7479             0.7241
29             NIST_?       0.7436             0.7231
30             GMD_3        0.7401             0.7196
31             GMD_1        0.7392             0.7198
32             GMD_4        0.7240             0.7042
33             NIST_?       0.7140             0.6981
34             KAMAN_3      0.7081             0.6824
35             COMCOM       0.7068             0.6993
36             KAMAN_2      0.6973             0.6731
37             NIST_2       0.6773             0.6336
38             GMD_2        0.6686             0.6429
39             VALEN_1      0.6660             0.6406
40             KAMAN_4      0.6431             0.6176
41             KAMAN_5      0.3874             0.3646
42             UMICH_2      0.0837             0.0137

Table 75: IFAX correlation graph key for uppers.
No Data Available

Figure 120: IFAX - lower case correlation

There is no data for this evaluation.

Table 76: IFAX correlation graph key for lowers.
SYSTEM: KAMAN_1

PARTICIPANT: Mark G. Costello

ORGANIZATION: Kaman Sciences Corporation, Utica, NY

FEATURES:

CLASSIFICATION:

HARDWARE: SPARC2, multiuser

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           800     2080    2080    NSDB1?

STATUS: on time

RESULTS:   -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
           REJ.    ERR.     REJ.    ERR.     REJ.    ERR.     TESTDATA1
           RATE    RATE     RATE    RATE     RATE    RATE
           0.00    0.1146   0.00    0.1503   0.00    0.3111

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.50    0.38    0.43
CPU RATE:

NOTE: The CON files for the KAMAN systems had numbers greater than 1, which is not allowed
by the NIST scoring package, so no rejection-rate data was calculated.

NOTE: No details of recognition algorithms provided.
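
The first note above says that confidence values greater than 1 are not accepted by the scoring package. A check of the following kind would surface such values before submission. It is a minimal sketch that assumes the confidences have already been parsed into floats (the CON file layout is not described here) and that 0 is the lower bound, which the note does not state; the function name is hypothetical.

```python
def find_invalid_confidences(confidences, low=0.0, high=1.0):
    """Return (index, value) pairs for confidences outside [low, high].
    The upper bound of 1 comes from the note; the lower bound is assumed."""
    return [
        (index, value)
        for index, value in enumerate(confidences)
        if not (low <= value <= high)
    ]


# Hypothetical confidence values; two of them exceed 1.
sample = [0.2, 1.0, 1.7, 0.0, 12.0]
problems = find_invalid_confidences(sample)
```

An empty result means every value is within range; any reported pair identifies the character position and the offending value.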




SYSTEM:  KAMAN_1
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[Plot axis: ERROR RATE (%)]


[Plot legend: kaman_1: digits, uppers, lowers]


Figure 121: Error rate versus rejection rate for KAMAN_1


Figure  122:  Error  rate  per  writer  of  KAMAN_1 
[Plot: KAMAN_1.DIGIT.CORRELATE; x-axis: SYSTEM NUMBER]


Figure  123:  KAMAN_1  - digit  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_1      1.0000             1.0000
2              KAMAN_3      0.9142             0.8448
3              KAMAN_2      0.9036             0.8398
4              VOTE_M       0.8998             0.8827
5              AEG          0.8937             0.8762
6              ATT_4        0.8934             0.8739
7              ERIM_1       0.8930             0.8746
8              VOTE_P       0.8924             0.8804
9              ATTJ         0.8922             0.8762
10             KODAK_2      0.8918             0.8733
11             NIST_4       0.8914             0.8694
12             IBM          0.8898             0.8746
13             NESTOR       0.8898             0.8713
14             ERIM_2       0.8896             0.8736
15             KODAK_1      0.8893             0.8699
16             THINK_1      0.8887             0.8687
17             ELSAGB_3     0.8879             0.8742
18             ELSAGB_2     0.8876             0.8739
19             SYMBUS       0.8873             0.8693
20             ATT.S        0.8869             0.8696
21             ELSAGB_1     0.8866             0.8666
22             OCRSYS       0.8864             0.8800
23             UBOL         0.8867             0.8691
24             REFERENCE    0.8864             0.8864
25             ATTJ         0.8846             0.8730
26             GTESS_2      0.8833             0.8693
27             RISO         0.8828             0.8426
28             GTESS_1      0.8821             0.8693
29             HUGHES_1     0.8806             0.8666
30             HUGHES_2     0.8802             0.8663
31             NIST_3       0.8801             0.8463
32             THINK_2      0.8786             0.8674
33             NYNEX        0.8786             0.8660
34             NIST_2       0.8786             0.8470
35             GMD_3        0.8746             0.8493
36             REI          0.8743             0.8666
37             NIST_1       0.8741             0.8498
38             MIME         0.8733             0.8469
39             ASOL         0.8701             0.8446
40             GMD_1        0.8669             0.8426
41             UPENN        0.8640             0.8387
42             KAMAN_5      0.8610             0.8127
43             COMCOM       0.8667             0.8527
44             GMD_4        0.8620             0.8289
45             GMD_2        0.8396             0.8024
46             KAMAN_4      0.8284             0.7696
47             IFAX         0.8003             0.7734
48             VALEN_1      0.7973             0.7678
49             VALEN_2      0.7961             0.7782

Table 77: KAMAN_1 correlation graph key for digits.
[Plot: KAMAN_1.UPPER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure 124: KAMAN_1 - upper case correlation


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_1      1.0000             1.0000
2              KAMAN_3      0.8844             0.7818
3              VOTE_M       0.8636             0.8452
4              KAMAN_2      0.8558             0.7661
5              ATT_4        0.8523             0.8333
6              REFERENCE    0.8497             0.8497
7              UMICH_1      0.8491             0.8306
8              IBM          0.8482             0.8269
9              AEG          0.8478             0.8354
10             VOTE_P       0.8470             0.8360
11             NESTOR       0.8436             0.8268
12             ERIM_1       0.8434             0.8279
13             ATTJ         0.8429             0.8261
14             NYNEX        0.8400             0.8272
15             UBOL         0.8395             0.8213
16             KODAK_1      0.8393             0.8209
17             ATT J        0.8385             0.8204
18             HUGHES_1     0.8361             0.8184
19             SYMBUS       0.8361             0.8175
20             HUGHES_2     0.8349             0.8168
21             ATTJ         0.8304             0.8160
22             GTESS_1      0.8240             0.8076
23             GTESS_2      0.8223             0.8065
24             MIME         0.8204             0.7987
25             NIST_4       0.8189             0.7967
26             OCRSYS       0.8180             0.8070
27             RISO         0.8153             0.7803
28             ASOL         0.8116             0.7903
29             REI          0.7991             0.7803
30             GMD_1        0.7949             0.7708
31             GMD_3        0.7927             0.7685
32             NIST_1       0.7921             0.7689
33             GMD_4        0.7763             0.7532
34             NISTJ        0.7729             0.7499
35             KAMAN_4      0.7566             0.6932
36             IFAX         0.7501             0.7217
37             COMCOM       0.7391             0.7327
38             NISTJ        0.7268             0.7012
39             GMD_2        0.7225             0.6917
40             VALEN_1      0.7219             0.6901
41             KAMAN_5      0.7117             0.6372
42             UMICH_2      0.0648             0.0149

Table 78: KAMAN_1 correlation graph key for uppers.
[Plot: KAMAN_1.LOWER.CORRELATE; x-axis: SYSTEM NUMBER]


Figure 125: KAMAN_1 - lower case correlation


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_1      1.0000             1.0000
2              KAMAN_3      0.9089             0.6464
3              KAMAN_2      0.8382             0.6208
4              VOTE_M       0.7229             0.6749
5              ATT_4        0.7004             0.6510
6              KODAKa       0.6947             0.6460
7              AEG          0.6899             0.6487
8              ERIM_1       0.6892             0.6481
9              UMICH_1      0.6890             0.6428
10             REFERENCE    0.6889             0.6889
11             IBM          0.6848             0.6390
12             VOTE_P       0.6834             0.6568
13             ATTJ         0.6825             0.6442
14             ATTJ         0.6822             0.6355
15             RISO         0.6821             0.6172
16             NYNEX        0.6809             0.6431
17             UBOL         0.6802             0.6338
18             NESTOR       0.6800             0.6382
19             KAMAN_5      0.6787             0.5261
20             HUGHES_1     0.6762             0.6332
21             HUGHES_2     0.6747             0.6324
22             OCRSYS       0.6744             0.6381
23             NIST_4       0.6733             0.6166
24             ATTJ         0.6697             0.6336
25             GTESS_1      0.6619             0.6208
26             NIST_1       0.6604             0.6181
27             GMD_3        0.6584             0.6108
28             GTESS_2      0.6571             0.6157
29             ASOL         0.6506             0.6039
30             GMD_4        0.6470             0.5987
31             GMD_1        0.6470             0.5987
32             NIST_3       0.6411             0.6094
33             KAMAN_4      0.6373             0.4952
34             GMD_2        0.6262             0.5715
35             VALEN_1      0.6057             0.5458
36             NIST_2       0.5961             0.5460
37             COMCOM       0.4218             0.4098
38             UMICH_2      0.1265             0.0362

Table 79: KAMAN_1 correlation graph key for lowers.
SYSTEM:  KAMAN_2


PARTICIPANT:  Mark G. Costello

ORGANIZATION:  Kaman Sciences Corporation, Utica, NY

FEATURES:

CLASSIFICATION:

HARDWARE:  SPARC2, multiuser

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           800     2080    2080    NSDB1?

STATUS:  on time

RESULTS:   -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
           REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
           RATE   RATE     RATE   RATE     RATE   RATE
           0.00   0.1338   0.00   0.2074   0.00   0.3511

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.76    0.47    0.47
CPU RATE:
SYSTEM:  KAMAN_2
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[Plot axis: NUMBER WRITERS WITH ERROR E]


[Plot legend: kaman_2: digits, uppers, lowers; x-axis: REJECTION RATE (%)]


Figure 126: Error rate versus rejection rate for KAMAN_2


[Plot: KAMAN_2; x-axis: RECOGNITION PERCENT ERROR E]


Figure 127: Error rate per writer of KAMAN_2
[Plot: KAMAN_2.DIGIT.CORRELATE; x-axis: SYSTEM NUMBER]


Figure  128:  KAMAN_2  - digit  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_2      1.0000             1.0000
2              KAMAN_3      0.9343             0.8436
3              KAMAN_1      0.9036             0.8398
4              KAMAN_5      0.8899             0.8162
5              KAMAN_4      0.8864             0.7807
6              VOTE_M       0.9909             0.8639
7              AEG          0.9769             0.8597
8              ERIM_1       0.9742             0.8561
9              NIST_4       0.9742             0.8520
10             VOTE_P       0.9739             0.9621
11             NESTOR       0.9726             0.8537
12             ATT_4        0.9721             0.9542
13             KODAK_2      0.9719             0.8546
14             ATTJ         0.9715             0.9559
15             IBM          0.9701             0.8556
16             ERIM_2       0.9701             0.9545
17             ELSAGBJ)     0.9699             0.8557
18             UBOL         0.9699             0.9519
19             ELSAGB^      0.9696             0.9554
20             ELSAGB_1     0.9695             0.9490
21             KODAK_1      0.8692             0.9512
22             SYMBUS       0.9695             0.9512
23             ATTJ         0.9690             0.8512
24             THINK_1      0.9679             0.8494
25             OCRSYS       0.9671             0.8609
26             REFERENCE    0.9662             0.8662
27             ATT J        0.9651             0.8543
28             GTESS_2      0.9642             0.8408
29             HUGHES_1     0.9631             0.8480
30             HUGHES_2     0.9630             0.8480
31             GTESS_1      0.9626             0.9406
32             RISO         0.9605             0.9233
33             THINK_2      0.9595             0.8489
34             NISTJ        0.8591             0.8284
35             GMD_3        0.9586             0.8324
36             NYNEX        0.9576             0.9468
37             NIST_2       0.8571             0.8283
38             REI          0.8564             0.9476
39             NIST_1       0.9556             0.9319
40             MIME         0.8530             0.8276
41             ASOL         0.8527             0.9271
42             GMD_1        0.8499             0.8257
43             UPENN        0.8423             0.8205
44             COMCOM       0.8386             0.8353
45             GMD_4        0.8363             0.8123
46             GMD_2        0.8249             0.7868
47             VALEN_1      0.7921             0.7567
48             IFAX         0.7949             0.7581
49             VALEN_2      0.7834             0.7656

Table  80:  KAMAN_2  correlation  graph  key  for  digits. 
[Plot: KAMAN_2.UPPER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure 129: KAMAN_2 - upper case correlation


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_2      1.0000             1.0000
2              KAMAN_3      0.8915             0.7546
3              KAMAN_1      0.8558             0.7661
4              KAMAN_4      0.8382             0.7018
5              VOTE_M       0.8069             0.7889
6              ATT_4        0.7977             0.7789
7              VOTE_P       0.7935             0.7826
8              UMICH_1      0.7931             0.7756
9              REFERENCE    0.7926             0.7926
10             AEG          0.7922             0.7802
11             NESTOR       0.7916             0.7736
12             IBM          0.7905             0.7710
13             ERIM_1       0.7901             0.7740
14             KODAK_1      0.7884             0.7685
15             ATTJ         0.7880             0.7684
16             ATTJ         0.7870             0.7712
17             UBOL         0.7870             0.7684
18             SYMBUS       0.7862             0.7659
19             NYNEX        0.7856             0.7731
20             HUGHES_1     0.7798             0.7633
21             HUGHES_2     0.7792             0.7622
22             RISO         0.7792             0.7381
23             ATTJ         0.7785             0.7628
24             MIME         0.7728             0.7488
25             NIST_4       0.7726             0.7470
26             GTESS_1      0.7725             0.7554
27             GTESS_2      0.7722             0.7546
28             OCRSYS       0.7672             0.7555
29             ASOL         0.7650             0.7421
30             KAMAN_5      0.7566             0.6400
31             GMD_1        0.7555             0.7271
32             GMD_3        0.7535             0.7252
33             REI          0.7533             0.7328
34             NIST_1       0.7494             0.7229
35             NIST_3       0.7374             0.7100
36             GMD_4        0.7360             0.7097
37             VALEN_1      0.6998             0.6588
38             IFAX         0.6973             0.6731
39             GMD_2        0.6954             0.6586
40             NIST_2       0.6953             0.6640
41             COMCOM       0.6890             0.6833
42             UMICH_2      0.0728             0.0131

Table  81:  KAMAN_2  correlation  graph  key  for  uppers. 
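
This section does not define the two correlation columns. One reading that fits the REFERENCE rows is simple agreement: in Table 81 the REFERENCE entry is 0.7926 in both columns, which equals one minus the KAMAN_2 upper case error rate of 0.2074. The sketch below computes agreement under that assumed reading only; the function name `agreement` and the sample strings are hypothetical and are not taken from the report.

```python
from typing import Sequence, Tuple


def agreement(a: Sequence[str], b: Sequence[str],
              truth: Sequence[str]) -> Tuple[float, float]:
    """Return (fraction where a and b agree, fraction where both agree and are correct)."""
    if not (len(a) == len(b) == len(truth)) or len(truth) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(truth)
    same = sum(1 for x, y in zip(a, b) if x == y)
    same_correct = sum(1 for x, y, t in zip(a, b, truth) if x == y == t)
    return same / n, same_correct / n


truth = "ABCDE"     # made-up reference labels
system_a = "ABCXE"  # one error (position 3)
system_b = "ABDXE"  # two errors (positions 2 and 3)

print(agreement(system_a, system_b, truth))  # (0.8, 0.6)
print(agreement(system_a, truth, truth))     # (0.8, 0.8): against the reference, both equal accuracy
```

Under this reading the second column can never exceed the first, and a system compared with the reference labels scores its own accuracy in both columns.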
[Plot: KAMAN_2.LOWER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  130:  KAMAN_2  - lower  case  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_2      1.0000             1.0000
2              KAMAN_1      0.8382             0.6208
3              KAMAN_3      0.8320             0.6050
4              KAMAN_4      0.7484             0.5091
5              KAMAN_5      0.6985             0.5269
6              VOTE_M       0.6818             0.6378
7              ATT_4        0.6608             0.6148
8              KODAKU       0.6666             0.6112
9              AEG          0.6631             0.6137
10             REFERENCE    0.6489             0.6489
11             ERIM_1       0.6482             0.6116
12             UMICH_1      0.6476             0.6062
13             RISO         0.6472             0.6833
14             VOTE_P       0.6467             0.6209
15             ATT J        0.6442             0.6013
16             IBM          0.6436             0.6032
17             NESTOR       0.6418             0.6028
18             UBOL         0.6417             0.6992
19             ATT J        0.6416             0.6076
20             NYNEX        0.6398             0.6069
21             OCRSYS       0.6362             0.6022
22             NIST_4       0.6360             0.6839
23             HUGHES_1     0.6361             0.6970
24             HUGHES_2     0.6342             0.6968
25             ATT J        0.6311             0.6983
26             GMD_3        0.6261             0.6800
27             NIST_1       0.6266             0.6847
28             GTESS_1      0.6246             0.6880
29             GTESS_2      0.6198             0.6828
30             ASOL         0.6149             0.6710
31             GMD_4        0.6148             0.6684
32             GMD_1        0.6148             0.6684
33             NIST_3       0.6083             0.6766
34             GMD_2        0.6921             0.6398
35             VALEN_1      0.6794             0.6218
36             NIST_2       0.6676             0.6173
37             COMCOM       0.3976             0.3869
38             UMICH_2      0.1227             0.0334

Table  82:  KAMAN_2  correlation  graph  key  for  lowers. 
SYSTEM:  KAMAN_3


PARTICIPANT:  Mark G. Costello

ORGANIZATION:  Kaman Sciences Corporation, Utica, NY

FEATURES:

CLASSIFICATION:

HARDWARE:  SPARC2, multiuser

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           800     2080    2080    NSDB1?

STATUS:  on time

RESULTS:   -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
           REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
           RATE   RATE     RATE   RATE     RATE   RATE
           0.00   0.1313   0.00   0.1978   0.00   0.3355

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.76    0.47    0.47
CPU RATE:
SYSTEM:  KAMAN_3
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[Plot axis: ERROR RATE (%)]


[Plot legend: kaman_3: digits, uppers, lowers]


Figure  131:  Error  rate  versus  rejection  rate  for  KAMAN_3 


Figure  132:  Error  rate  per  writer  of  KAMAN_3 
[Plot: KAMAN_3.DIGIT.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  133:  KAMAN_3  - digit  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_3      1.0000             1.0000
2              KAMAN_2      0.9343             0.8436
3              KAMAN_1      0.9142             0.8448
4              VOTE_M       0.8837             0.8664
5              AEG          0.8789             0.8608
6              KAMAN_5      0.8770             0.8111
7              NIST_4       0.8766             0.8644
8              VOTE_P       0.8766             0.8646
9              ERIM_1       0.8764             0.8683
10             NESTOR       0.8763             0.8669
11             ATT_2        0.8747             0.8687
12             KODAK_2      0.8747             0.8672
13             ATT_4        0.8746             0.8666
14             IBM          0.8737             0.8684
15             ERIM_2       0.8733             0.8672
16             ELSAGB_3     0.8723             0.8682
17             ELSAGB_2     0.8720             0.8579
18             UBOL         0.8720             0.8642
19             ELSAGB_1     0.8720             0.8613
20             KODAK_1      0.8718             0.8638
21             THINK_1      0.8710             0.8626
22             SYMBUS       0.8698             0.8628
23             OCRSYS       0.8697             0.8633
24             ATT_3        0.8696             0.8630
25             REFERENCE    0.8687             0.8687
26             ATT_1        0.8679             0.8669
27             HUGHES_1     0.8664             0.8608
28             HUGHES_2     0.8663             0.8607
29             GTESS_2      0.8662             0.8432
30             GTESS_1      0.8663             0.8432
31             THINK_2      0.8630             0.8618
32             RISO         0.8626             0.8268
33             NYNEX        0.8614             0.8496
34             GMD_3        0.8610             0.8361
35             NIST_3       0.8604             0.8303
36             REI          0.8696             0.8603
37             NIST_2       0.8680             0.8303
38             NIST_1       0.8670             0.8340
39             MIME         0.8666             0.8306
40             ASOL         0.8641             0.8291
41             GMD_1        0.8620             0.8280
42             UPENN        0.8466             0.8232
43             KAMAN_4      0.8446             0.7683
44             COMCOM       0.8410             0.837g
45             GMD_4        0.8384             0.8147
46             GMD_2        0.8268             0.7887
47             VALEN_1      0.7926             0.7584
48             IFAX         0.7878             0.7603
49             VALEN_2      0.7847             0.7670

Table  83:  KAMAN_3  correlation  graph  key  for  digits. 
[Plot: KAMAN_3.UPPER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  134:  KAMAN_3  - upper  case  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_3      1.0000             1.0000
2              KAMAN_2      0.8915             0.7546
3              KAMAN_1      0.8844             0.7818
4              VOTE_M       0.8151             0.7983
5              ATT_4        0.8053             0.7875
6              REFERENCE    0.8022             0.8022
7              UMICH_1      0.8018             0.7849
8              NESTOR       0.8013             0.7837
9              VOTE_P       0.8012             0.7910
10             AEG          0.8007             0.7892
11             IBM          0.7988             0.7801
12             ERIM_1       0.7983             0.7833
13             ATT_2        0.7973             0.7813
14             UBOL         0.7968             0.7774
15             KODAK_1      0.7962             0.7769
16             ATTJ         0.7951             0.7767
17             NYNEX        0.7949             0.7827
18             SYMBUS       0.7934             0.7741
19             HUGHES_1     0.7894             0.7728
20             HUGHES_2     0.7881             0.7714
21             ATTJ         0.7872             0.7723
22             RISO         0.7835             0.7449
23             GTESS_1      0.7807             0.7643
24             NIST_4       0.7797             0.7558
25             GTESS_2      0.7792             0.7633
26             MIME         0.7782             0.7560
27             OCRSYS       0.7756             0.7646
28             ASOL         0.7711             0.7495
29             REI          0.7597             0.7399
30             GMD_1        0.7597             0.7339
31             KAMAN_5      0.7594             0.6348
32             NIST_1       0.7570             0.7315
33             GMD_3        0.7564             0.7313
34             KAMAN_4      0.7499             0.6728
35             GMD_4        0.7406             0.7166
36             NIST_3       0.7406             0.7164
37             IFAX         0.7081             0.6824
38             VALEN_1      0.6998             0.6628
39             COMCOM       0.6997             0.6937
40             NIST_2       0.6990             0.6706
41             GMD_2        0.6963             0.6633
42             UMICH_2      0.0718             0.0137

Table  84:  KAMAN_3  correlation  graph  key  for  uppers. 
[Plot: KAMAN_3.LOWER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  135:  KAMAN_3  - lower  case  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_3      1.0000             1.0000
2              KAMAN_1      0.9089             0.6464
3              KAMAN_2      0.8320             0.6050
4              VOTE_M       0.6968             0.6508
5              KAMAN_5      0.6873             0.5186
6              ATT_4        0.6750             0.6283
7              KODAK J      0.6685             0.6230
8              AEG          0.6677             0.6266
9              UMICH_1      0.6670             0.6216
10             ERIM_1       0.6668             0.6262
11             REFERENCE    0.6645             0.6645
12             IBM          0.6624             0.6177
13             VOTE_P       0.6597             0.6345
14             ATT J        0.6597             0.6146
15             ATT_2        0.6596             0.6227
16             NESTOR       0.6593             0.6179
17             RISO         0.6593             0.5964
18             NYNEX        0.6566             0.6209
19             UBOL         0.6562             0.6116
20             OCRSYS       0.6529             0.6169
21             HUGHES_1     0.6525             0.6112
22             NIST_4       0.6516             0.5969
23             HUGHES_2     0.6515             0.6109
24             ATT J        0.64«1             0.6118
25             GTESS_1      0.6413             0.6007
26             NIST_1       0.6388             0.5978
27             GMD_3        0.6373             0.5907
28             GTESS_2      0.6363             0.5949
29             ASOL         0.6314             0.5863
30             GMD_4        0.6270             0.5795
31             GMD_1        0.6270             0.5795
32             NIST_3       0.6222             0.5898
33             KAMAN_4      0.6184             0.4792
34             GMD_2        0.6093             0.5548
35             VALEN_1      0.5921             0.5312
36             NIST_2       0.5806             0.5297
37             COMCOM       0.4097             0.3981
38             UMICH_2      0.1213             0.0336

Table  85:  KAMAN_3  correlation  graph  key  for  lowers. 
SYSTEM:  KAMAN_4


PARTICIPANT:  Mark G. Costello

ORGANIZATION:  Kaman Sciences Corporation, Utica, NY

FEATURES:

CLASSIFICATION:

HARDWARE:  SPARC2, multiuser

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           800     2080    2080    NSDB1?

STATUS:  on time

RESULTS:   -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
           REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
           RATE   RATE     RATE   RATE     RATE   RATE
           0.00   0.2072   0.00   0.2728   0.00   0.4625

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.94    0.56    0.56
CPU RATE:
SYSTEM:  KAMAN_4
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[Plot axis: ERROR RATE (%)]


[Plot legend: kaman_4: digits, uppers, lowers; x-axis: REJECTION RATE (%)]


Figure 136: Error rate versus rejection rate for KAMAN_4


Figure 137: Error rate per writer of KAMAN_4
[Plot: KAMAN_4.DIGIT.CORRELATE; x-axis: SYSTEM NUMBER]


Figure  138:  KAMAN_4  - digit  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_4      1.0000             1.0000
2              KAMAN_2      0.8864             0.7807
3              KAMAN_3      0.8446             0.7683
4              KAMAN_1      0.8284             0.7696
5              VOTE_M       0.8064             0.7903
6              AEG          0.8043             0.7867
7              NIST_4       0.8036             0.7813
8              ERIM_1       0.8016             0.7844
9              ATT_4        0.8000             0.7826
10             VOTE_P       0.7993             0.7886
11             UBOL         0.7990             0.7809
12             ATT_2        0.7983             0.7834
13             KODAK_2      0.7978             0.7819
14             ERIM_2       0.7976             0.7825
15             NIST_3       0.7976             0.7628
16             ATTJ         0.7976             0.7802
17             RISO         0.7974             0.7682
18             NESTOR       0.7971             0.7807
19             ELSAGB_3     0.7968             0.7834
20             ELSAGB_2     0.7966             0.7831
21             SYMBUS       0.7966             0.7800
22             ELSAGB_1     0.7964             0.7776
23             GTESS_2      0.7963             0.7715
24             THINK_1      0.7962             0.7780
25             KODAK_1      0.7966             0.7788
26             NIST_2       0.7960             0.7624
27             GTESS_1      0.7946             0.7713
28             IBM          0.7946             0.7824
29             OCRSYS       0.7933             0.7876
30             REFERENCE    0.7928             0.7928
31             ATTU         0.7922             0.7821
32             GMD_3        0.7916             0.7646
33             HUGHES_1     0.7894             0.7760
34             HUGHES_2     0.7891             0.7769
35             MIME         0.7882             0.7606
36             THINK_2      0.7866             0.7766
37             ASOL         0.7866             0.7692
38             NIST_1       0.7863             0.7623
39             NYNEX        0.7863             0.7750
40             REI          0.7832             0.7766
41             GMD_1        0.7832             0.7686
42             KAMAN_5      0.7763             0.7307
43             UPENN        0.7740             0.7617
44             GMD_2        0.7727             0.7280
45             GMD_4        0.7713             0.7463
46             COMCOM       0.7676             0.7646
47             VALEN_1      0.7310             0.6964
48             IFAX         0.7267             0.6971
49             VALEN_2      0.7180             0.7006

Table 86: KAMAN_4 correlation graph key for digits.
[Plot: KAMAN_4.UPPER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  139:  KAMAN_4  - upper  case  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_4      1.0000             1.0000
2              KAMAN_2      0.8382             0.7018
3              KAMAN_1      0.7566             0.6932
4              KAMAN_3      0.7499             0.6728
5              VOTE_M       0.7397             0.7232
6              ATT_4        0.7351             0.7162
7              AEG          0.7290             0.7169
8              ATTJ         0.7286             0.7078
9              VOTE_P       0.7279             0.7177
10             UMICH_1      0.7275             0.7108
11             REFERENCE    0.7272             0.7272
12             RISO         0.7272             0.6826
13             SYMBUS       0.7267             0.7056
14             IBM          0.7264             0.7076
15             UBOL         0.7262             0.7070
16             NESTOR       0.7258             0.7091
17             KODAK_1      0.7249             0.7054
18             ERIM_1       0.7248             0.7097
19             ATTJ         0.7222             0.7077
20             NYNEX        0.7215             0.7094
21             ATTJ         0.7178             0.7012
22             HUGHES_1     0.7174             0.7010
23             MIME         0.7163             0.6904
24             HUGHES_2     0.7162             0.6997
25             NIST_4       0.7159             0.6880
26             GTESS_2      0.7117             0.6933
27             GTESS_1      0.7116             0.6939
28             ASOL         0.7107             0.6855
29             OCRSYS       0.7047             0.6945
30             GMD_1        0.7012             0.6710
31             GMD_3        0.6998             0.6696
32             NIST_1       0.6986             0.6691
33             REI          0.6958             0.6750
34             NIST_3       0.6910             0.6583
35             GMD_4        0.6836             0.6550
36             NIST_2       0.6592             0.6189
37             GMD_2        0.6543             0.6134
38             VALEN_1      0.6519             0.6074
39             IFAX         0.6431             0.6176
40             COMCOM       0.6335             0.6282
41             KAMAN_5      0.5948             0.5491
42             UMICH_2      0.0855             0.0126

Table  87:  KAMAN_4  correlation  graph  key  for  uppers. 
[Plot: KAMAN_4.LOWER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  140:  KAMAN_4  - lower  case  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_4      1.0000             1.0000
2              KAMAN_2      0.7484             0.5091
3              KAMAN_1      0.6373             0.4952
4              KAMAN_3      0.6184             0.4792
5              VOTE_M       0.5572             0.5249
6              ATT_4        0.4477             0.6107
7              RISO         0.5463             0.4889
8              KODAKU       0.5461             0.5091
9              AEG          0.5438             0.5117
10             NIST_4       0.5388             0.4908
11             UMICH_1      0.5384             0.5052
12             ATT J        0.5381             0.5007
13             REFERENCE    0.5375             0.5375
14             UBOL         0.5372             0.5010
15             ERIM_1       0.5356             0.5058
16             ATTJ         0.5334             0.5045
17             NYNEX        0.5311             0.5022
18             VOTE_P       0.5306             0.5124
19             HUGHES_1     0.5293             0.4966
20             HUGHES_2     0.5288             0.4958
21             IBM          0.5284             0.4976
22             NESTOR       0.5280             0.4996
23             GMD_3        0.5268             0.4839
24             ATTJ         0.5264             0.4979
25             OCRSYS       0.5228             0.4985
26             NIST_1       0.5225             0.4866
27             GTESS_2      0.5224             0.4880
28             GTESS_1      0.5203             0.4888
29             GMD_4        0.5192             0.4757
30             GMD_1        0.5192             0.4757
31             ASOL         0.5184             0.4772
32             NIST_3       0.5127             0.4796
33             GMD_2        0.5033             0.4527
34             VALEN_1      0.4939             0.4405
35             NIST_2       0.4865             0.4370
36             KAMAN_5      0.4469             0.3871
37             COMCOM       0.3302             0.3223
38             UMICH_2      0.1213             0.0243

Table  88:  KAMAN_4  correlation  graph  key  for  lowers. 
SYSTEM:  KAMAN_5


PARTICIPANT:  Mark G. Costello

ORGANIZATION:  Kaman Sciences Corporation, Utica, NY

FEATURES:

CLASSIFICATION:

HARDWARE:  SPARC2, multiuser

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           800     2080    2080    NSDB1?

STATUS:  on time

RESULTS:   -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
           REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
           RATE   RATE     RATE   RATE     RATE   RATE
           0.00   0.1513   0.00   0.3395   0.00   0.4220

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        6.57    3.74    3.74
CPU RATE:
SYSTEM:  KAMAN_5
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[Plot axis: NUMBER WRITERS WITH ERROR]


Figure  141:  Error  rate  versus  rejection  rate  for  KAMAN_5 


Figure  142:  Error  rate  per  writer  of  KAMAN_5 
[Plot: KAMAN_5.DIGIT.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  143:  KAMAN_5  - digit  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_5      1.0000             1.0000
2              KAMAN_2      0.8899             0.8162
3              KAMAN_3      0.8770             0.8111
4              KAMAN_1      0.8610             0.8127
5              VOTE_M       0.8590             0.8448
6              NESTOR       0.8561             0.8364
7              AEG          0.8540             0.8391
8              VOTE_P       0.8522             0.8423
9              ERIM_1       0.8521             0.8367
10             KODAK_2      0.8514             0.8361
11             IBM          0.8513             0.8380
12             ATTJ         0.8508             0.8375
13             ATT_4        0.8503             0.8352
14             NIST_4       0.8503             0.8319
15             ELSAGB J     0.8501             0.8371
16             ELSAGB J     0.8499             0.8369
17             KODAK_1      0.8497             0.8330
18             ELSAGB_1     0.8491             0.8303
19             OCRSYS       0.8489             0.8431
20             REFERENCE    0.8487             0.8487
21             SYMBUS       0.8484             0.8327
22             ERIM_2       0.8482             0.8349
23             THINK_1      0.8474             0.8307
24             UBOL         0.8467             0.8321
25             ATTJ         0.8463             0.8363
26             ATT J        0.8455             0.8314
27             HUGHES_2     0.8452             0.8309
28             HUGHES_1     0.8443             0.8304
29             THINK_2      0.8420             0.8317
30             NYNEX        0.8414             0.8303
31             GTESS_2      0.8392             0.8212
32             REI          0.8389             0.8305
33             GTESS_1      0.8382             0.8211
34             GMD_3        0.8374             0.8138
35             RISO         0.8359             0.8035
36             NIST_1       0.8350             0.8138
37             NISTJ        0.8313             0.8080
38             NISTJ        0.8310             0.8074
39             ASOL         0.8302             0.8080
40             GMD_1        0.8297             0.8074
41             MIME         0.8279             0.8071
42             COMCOM       0.8223             0.8189
43             UPENN        0.8211             0.8022
44             GMD_4        0.8170             0.7946
45             GMD_2        0.7975             0.7664
46             VALEN_1      0.7828             0.7440
47             KAMAN_4      0.7763             0.7307
48             VALEN_2      0.7695             0.7506
49             IFAX         0.7653             0.7412

Table  89:  KAMAN_5  correlation  graph  key  for  digits. 
[Plot: KAMAN_5.UPPER.CORRELATE; x-axis: SYSTEM NUMBER]


Figure  144:  KAMAN_5  - upper  case  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_5      1.0000             1.0000
2              KAMAN_3      0.7594             0.6348
3              KAMAN_2      0.7566             0.6400
4              KAMAN_1      0.7117             0.6372
5              VOTE_M       0.6690             0.6660
6              NESTOR       0.6648             0.6473
7              ATT_4        0.6616             0.6469
8              REFERENCE    0.6606             0.6606
9              IBM          0.6603             0.6436
10             UMICH_1      0.6692             0.6468
11             VOTE_P       0.6687             0.6608
12             AEG          0.6686             0.6491
13             ERIM_1       0.6666             0.6441
14             ATT_2        0.6662             0.6429
15             NYNEX        0.6661             0.6460
16             KODAK_1      0.6646             0.6399
17             SYMBUS       0.6642             0.6381
18             UBOL         0.6631             0.6386
19             ATT J        0.6616             0.6384
20             HUGHES_1     0.6481             0.6360
21             HUGHES_2     0.6479             0.6361
22             ATTJ         0.6478             0.6366
23             RISO         0.6464             0.6143
24             GTESS_1      0.6443             0.6303
25             GTESS_2      0.6434             0.6294
26             MIME         0.6422             0.6234
27             OCRSYS       0.6417             0.6314
28             NIST_4       0.6414             0.6215
29             ASOL         0.6360             0.6189
30             GMD_1        0.6319             0.6073
31             REI          0.6303             0.6121
32             GMD_3        0.6286             0.6049
33             NIST_1       0.6239             0.6022
34             GMD_4        0.6171             0.6928
35             NIST_3       0.6136             0.6922
36             VALEN_1      0.6987             0.6696
37             KAMAN_4      0.5948             0.5491
38             IFAX         0.6874             0.6646
39             NIST_2       0.6829             0.6680
40             GMD_2        0.6801             0.6490
41             COMCOM       0.6768             0.6718
42             UMICH_2      0.0927             0.0113

Table  90:  KAMAN_5  correlation  graph  key  for  uppers. 
[Plot: KAMAN_5.LOWER.CORRELATE; x-axis: SYSTEM NUMBER]

Figure  145:  KAMAN_5  - lower  case  correlation 


System Number  System Name  Correlation (all)  Correlation (correct)
1              KAMAN_5      1.0000             1.0000
2              KAMAN_2      0.6985             0.5269
3              KAMAN_3      0.6873             0.5186
4              KAMAN_1      0.6787             0.5261
5              VOTE_M       0.6000             0.5641
6              ATT_4        0.5823             0.5420
7              KODAK J      0.5787             0.5397
8              REFERENCE    0.5780             0.5780
9              ERIM_1       0.5763             0.5423
10             NESTOR       0.5761             0.5374
11             AEG          0.5755             0.6407
12             UMICH_1      0.6727             0.6363
13             IBM          0.5724             0.6362
14             ATT J        0.5709             0.6332
15             OCRSYS       0.5707             0.6366
16             NYNEX        0.5693             0.5378
17             ATTJ         0.5680             0.6386
18             VOTE_P       0.5679             0.6470
19             UBOL         0.5667             0.6284
20             RISO         0.5667             0.6168
21             ATTJ         0.5658             0.6338
22             HUGHES_2     0.5648             0.6293
23             HUGHES_1     0.5642             0.6293
24             NIST_4       0.5602             0.6136
25             NIST_1       0.5562             0.6202
26             GTESS_1      0.5527             0.6203
27             GMD_3        0.5522             0.6116
28             GTESS_2      0.5490             0.6166
29             GMD_4        0.5438             0.5020
30             GMD_1        0.5438             0.6020
31             ASOL         0.5412             0.6042
32             NISTJ        0.5383             0.5111
33             VALEN_1      0.5253             0.4666
34             GMD_2        0.5239             0.4779
35             NISTJ        0.5059             0.4600
36             KAMAN_4      0.4469             0.3871
37             COMCOM       0.3582             0.3483
38             UMICH_2      0.1276             0.0333

Table  91:  KAMAN_5  correlation  graph  key  for  lowers. 
SYSTEM:  KODAK_1


PARTICIPANT:  Dr. Arun Rao

ORGANIZATION:  Eastman Kodak Company, Rochester, NY

FEATURES:  Gabor functions, polynomial, and local receptor fields

CLASSIFICATION:  four layer NN with local receptive fields and
                 proprietary BP

HARDWARE:  HP/Apollo 730

TRAINING:  DIGITS  UPPERS  LOWERS   DATABASE
           180000  36000   4400022  NSDB3

STATUS:  on time

RESULTS:   -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
           REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
           RATE   RATE     RATE   RATE     RATE   RATE
           0.00   0.0474   0.00   0.0692   0.00   0.1449
           0.10   0.0151   0.10   0.0300   0.10   0.1014
           0.20   0.0105   0.20   0.0131   0.20   0.0643
           0.30   0.0105   0.30   0.0071   0.30   0.0388
           0.40   0.0109   0.40   0.0043   0.40   0.0261
           0.50   0.0110   0.50   0.0028   0.50   0.0148

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        14.82   11.68   8.95
CPU RATE:        27.35   12.47   11.78

NOTE:  Some  upper  case  characters  were  added  for  training  lowers. 
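
The RESULTS rows above pair each rejection rate with the error rate that remains. One common way such a curve is produced is to reject the lowest-confidence classifications first and score what is left. The sketch below assumes that convention, assumes the error rate is taken over accepted characters only, and uses a hypothetical function name (`error_at_rejection`) with made-up sample data; the scoring actually used for this table is not specified in this section.

```python
from typing import List, Tuple


def error_at_rejection(results: List[Tuple[float, bool]],
                       reject_fraction: float) -> float:
    """Error rate among accepted items after rejecting the least confident fraction.

    results: (confidence, is_correct) for each classified character.
    """
    if not 0.0 <= reject_fraction < 1.0:
        raise ValueError("reject_fraction must be in [0, 1)")
    ranked = sorted(results, key=lambda r: r[0])   # lowest confidence first
    n_reject = int(len(ranked) * reject_fraction)  # number of items to drop
    accepted = ranked[n_reject:]
    if not accepted:
        return 0.0
    errors = sum(1 for _, correct in accepted if not correct)
    return errors / len(accepted)


# Made-up data: ten characters, three misclassified.
sample = [(0.90, True), (0.80, True), (0.30, False), (0.60, True), (0.20, False),
          (0.95, True), (0.40, True), (0.70, False), (0.85, True), (0.99, True)]

for fraction in (0.0, 0.2, 0.5):
    print(fraction, error_at_rejection(sample, fraction))
```

With this sample the error rate falls from 0.3 at no rejection to 0.125 at 20% rejection and 0.0 at 50%, the same downward trend the table shows when confidence tracks correctness.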
SYSTEM:  KODAK_1
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 

[5][27][28] 
[Plot axis: NUMBER WRITERS WITH ERROR E]


[Plot legend: KODAK_1: DIGITS, UPPERS, LOWERS; x-axis: REJECTION RATE (%)]


Figure 146: Error rate versus rejection rate for KODAK_1


Figure 147: Error rate per writer of KODAK_1
[Plot: KODAK_1.DIGIT.CORRELATE; x-axis: SYSTEM NUMBER]

Figure 148: KODAK_1 - digit correlation


System Number  System Name  Correlation (all)  Correlation (correct)
1              KODAK_1      1.0000             1.0000
2              KODAK_2      0.984L             0.9489
3              VOTE_M       0.9637             0.9474
4              AEG          0.9636             0.9383
5              VOTE_P       0.9636             0.9418
6              REFERENCE    0.9626             0.9626
7              ATT_4        0.9626             0.9361
8              ATTJ         0.9623             0.9371
9              OCRSYS       0.9622             0.9468
10             ERIM_1       0.9604             0.9349
11             ELSAGBJI     0.9497             0.9371
12             ELSAGBJ      0.9492             0.9368
13             ATT_1        0.9484             0.9371
14             IBM          0.9480             0.9364
15             ERIM_2       0.9476             0.9336
16             ELSAGB_1     0.9431             0.9260
17             SYMBUS       0.9430             0.9273
18             NESTOR       0.9427             0.9286
19             UBOL         0.9426             0.9290
20             THINK_1      0.9409             0.9265
21             NIST_4       0.9409             0.9266
22             ATTJ         0.9403             0.9267
23             THINK_2      0.9397             0.9296
24             HUGHES_1     0.9388             0.9261
25             HUGHES_2     0.9379             0.9248
26             NYNEX        0.9374             0.9266
27             REI          0.9363             0.9276
28             GTESS_1      0.9279             0.9126
29             GTESS_2      0.9275             0.9119
30             COMCOM       0.9186             0.9168
31             NIST_1       0.9182             0.9016
32             GMD_3        0.9130             0.8973
33             MIME         0.9119             0.8948
34             ASOL         0.9093             0.8921
35             NIST_2       0.9070             0.8899
36             UPENN        0.9070             0.8897
37             GMD_1        0.9062             0.8904
38             NIST_3       0.9046             0.8868
39             RISO         0.8986             0.8786
40             GMD_4        0.8904             0.8760
41             KAMAN_1      0.8893             0.8699
42             KAMAN_3      0.8718             0.8638
43             KAMAN_2      0.8692             0.8612
44             KAMAN_5      0.8497             0.8330
45             GMD_2        0.8494             0.8319
46             VALEN_2      0.8393             0.8247
47             IFAX         0.8271             0.8117
48             VALEN_1      0.8172             0.8020
49             KAMAN_4      0.7966             0.7788

Table 92: KODAK_1 correlation graph key for digits.
[Plot omitted: KODAK_1 upper case correlation versus system number.]

Figure 149: KODAK_1 - upper case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1 

KODAK_l 

1 0000 

1.0000 

2 

VOTE^ 

0.9439 

0 9249 

3 

ATT.4 

0 9374 

0 9140 

4 

REFERENCE 

0 9306 

0 9306 

b 

AEG 

0 9302 

0.9156 

6 

ERIM.l 

0 9237 

0 9066 

7 

VOTEJ* 

0 9229 

0.9106 

8 

ATTJ 

0 9225 

0 9039 

9 

NYNEX 

0.9203 

0 9060 

10 

SYMBUS 

0.9170 

0.6941 

11 

UBOL 

0.9166 

0 8977 

12 

UMICHU 

0 9160 

0 9022 

13 

NESTOR 

0 9144 

0 6992 

14 

ATTJ 

0 9143 

0 6957 

15 

ATTJ 

0 9141 

0 6046 

16 

IBM 

0.9101 

0.8947 

17 

HUGHES. 1 

0 9073 

0 8917 

18 

GTESS.l 

0 9064 

0 8664 

19 

GTESSJ 

0 9060 

0 8654 

20 

HUGHES.2 

0 9047 

0 8891 

21 

MIME 

0 6965 

0 8711 

22 

OCRSYS 

0 6925 

0 8620 

23 

ASOL 

0 6619 

0 8592 

24 

NIST.4 

0 6607 

0 8608 

25 

RISO 

0 6625 

0 8359 

26 

NIST.l 

0 6565 

0.8344 

27 

REI 

0 6555 

0 8428 

28 

GMD.l 

0 6491 

0 8304 

29 

GMD.3 

0 6476 

0 6287 

30 

KAMAN.l 

0.6393 

0 8209 

31 

GMD.4 

0 6292 

0 8118 

32 

NISTJ 

0 6241 

0 8056 

33 

COMCOM 

0 6007 

0,7949 

34 

KAMANJ 

0 7952 

0 7769 

35 

IFAX 

0 7901 

0.7738 

36 

KAMAN.2 

0 7664 

0 7685 

37 

NIST-2 

0 7720 

0 7500 

38 

GMDJ2 

0 7591 

0.7364 

39 

VALEN-1 

0 7464 

0.7303 

40 

KAMAN.4 

0.7249 

0.7054 

41 

KAMAN.4 

0 6546 

0.6399 

42 

UMICH-2 

0 0431 

0 0222 

Table  93:  KODAK_l  correlation  graph  key  for  uppers. 
[Plot omitted: KODAK_1 lower case correlation versus system number.]

Figure 150: KODAK_1 - lower case correlation

System  System      Correlation  Correlation
Number  Name        (all)        (correct)

   1    KODAK_1     1.0000       1.0000
   2    VOTE_M      0.8866       0.8336
   3    REFERENCE   0.8551       0.8551
   4    ATT_4       0.8529       0.7972
   5    AEG         0.8508       0.8006
   6    ERIM_1      0.8445       0.7961
   7    ATTJ        0.8357       0.7911
   8    NYNEX       0.8320       0.7888
   9    UBOL        0.8267       0.7778
  10    ATTa        0.8261       0.7853
  11    NESTOR      0.8235       0.7791
  12    ATTJ        0.8199       0.7735
  13    OCRSYS      0.8184       0.7811
  14    HUGHES_1    0.8183       0.7738
  15    VOTE_P      0.8177       0.7884
  16    UMICH_1     0.8174       0.7755
  17    HUGHES_2    0.8163       0.7719
  18    IBM         0.8150       0.7749
  19    GTESSJ      0.8058       0.7612
  20    GTESSJ      0.8040       0.7573
  21    NIST_1      0.7934       0.7503
  22    NIST_4      0.7921       0.7408
  23    GMD_3       0.7843       0.7362
  24    ASOL        0.7788       0.7320
  25    RISC        0.7761       0.7279
  26    GMD_4       0.7673       0.7199
  27    GMD_1       0.7673       0.7199
  28    NIST_3      0.7485       0.7257
  29    GMD_2       0.7081       0.6662
  30    KAMAN_1     0.6947       0.6460
  31    VALEN_1     0.6798       0.6371
  32    NIST_2      0.6765       0.6399
  33    KAMAN_3     0.6685       0.6230
  34    KAMAN_2     0.6556       0.6112
  35    KAMAN_5     0.5787       0.5397
  36    KAMAN_4     0.5461       0.5091
  37    COMCOM      0.5007       0.4876
  38    UMICH_2     0.1031       0.0591

Table 94: KODAK_1 correlation graph key for lowers.




SYSTEM:  KODAK_2

PARTICIPANT:  Dr. Arun Rao

ORGANIZATION:  Eastman Kodak Company, Rochester, NY

FEATURES:  Gabor functions, polynomial, and local receptor fields

CLASSIFICATION:  four layer NN with local receptive fields and
proprietary BP

HARDWARE:  HP/Apollo 730

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             180000    NA        NA        NSDB3
             2310 sevens with crosses      INTERNAL

STATUS:  on time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0408
             0.10    0.0117
             0.20    0.0037
             0.30    0.0021
             0.40    0.0015
             0.50    0.0016

OCR RATE (CPS):  DIGITS    UPPERS    LOWERS
SYS RATE:        8.01      NA        NA
CPU RATE:

NOTE:  Crossed sevens were added to training set after determining need for them from results of
KODAK_1.
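
The RESULTS block above pairs each rejection rate with an error rate. A minimal sketch of how such pairs could be computed from per-character confidence values follows. It assumes that rejection discards the lowest-confidence fraction of the test characters and that the error rate is measured over the accepted characters only; the report's exact scoring rule is not restated in this section, and the name `error_at_rejection` and its arguments are illustrative.

```python
from typing import Sequence


def error_at_rejection(
    confidences: Sequence[float],
    correct: Sequence[bool],
    reject_fraction: float,
) -> float:
    """Error rate among accepted characters after rejecting the
    lowest-confidence fraction (assumed scoring rule, see lead-in)."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    n_reject = int(round(reject_fraction * n))
    # Sort character indices by confidence, lowest first, and drop the
    # first n_reject of them.
    order = sorted(range(n), key=lambda i: confidences[i])
    accepted = order[n_reject:]
    if not accepted:
        return 0.0
    errors = sum(1 for i in accepted if not correct[i])
    return errors / len(accepted)


if __name__ == "__main__":
    conf = [0.9, 0.2, 0.8, 0.4, 0.95, 0.6, 0.7, 0.3, 0.85, 0.99]
    ok = [True, False, True, False, True, True, False, True, True, True]
    for rate in (0.0, 0.2, 0.5):
        print(rate, error_at_rejection(conf, ok, rate))
```

Sweeping `reject_fraction` over 0.00, 0.10, ..., 0.50 would give one row per rejection rate, in the same layout as the table.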
SYSTEM:  KODAK_2
BIBLIOGRAPHY:

The following references have been provided for this system:

[5][27][28]
[Plot omitted: error rate (%) versus rejection rate (%) for KODAK_2, with a curve for digits.]

Figure 151: Error rate versus rejection rate for KODAK_2


[Plot omitted: error rate per writer for KODAK_2.]

Figure 152: Error rate per writer of KODAK_2
[Plot omitted: KODAK_2 digit correlation versus system number.]

Figure 153: KODAK_2 - digit correlation

System  System      Correlation  Correlation
Number  Name        (all)        (correct)

   1    KODAK_2     1.0000       1.0000
   2    KODAK_1     0.9841       0.9489
   3    VOTE_M      0.9716       0.9646
   4    VOTE_P      0.9606       0.9482
   5    AEG         0.9606       0.9448
   6    REFERENCE   0.9692       0.9692
   7    ATT_4       0.9691       0.9410
   8    ATT_2       0.9686       0.9431
   9    OCRSYS      0.9684       0.9621
  10    ERIM_1      0.9666       0.9410
  11    ELSAGB_3    0.9660       0.9427
  12    ELSAGB_2    0.9646       0.9423
  13    ATTJ        0.9644       0.9431
  14    ERIM_2      0.9639       0.9396
  15    IBM         0.9629       0.9408
  16    SYMBUS      0.9494       0.9337
  17    UBOL        0.9486       0.9349
  18    NESTOR      0.9481       0.9343
  19    ELSAGB_1    0.9480       0.9313
  20    NIST_4      0.9471       0.9314
  21    THINK_1     0.9466       0.9312
  22    ATTJ        0.9460       0.9323
  23    THINK_2     0.9460       0.9362
  24    HUGHES_1    0.9442       0.9306
  25    HUGHES_2    0.9432       0.9301
  26    NYNEX       0.9429       0.9323
  27    REI         0.9424       0.9338
  28    GTESS_1     0.9322       0.9170
  29    GTESS_2     0.9318       0.9162
  30    COMCOM      0.9244       0.9218
  31    NIST_1      0.9227       0.9063
  32    GMD_3       0.9172       0.9019
  33    MIME        0.9160       0.8992
  34    ASOL        0.9148       0.8973
  35    NIST_2      0.9127       0.8962
  36    UPENN       0.9119       0.8946
  37    GMD_1       0.9096       0.8963
  38    NIST_3      0.9093       0.8912
  39    RISO        0.9017       0.8821
  40    GMD_4       0.8946       0.8807
  41    KAMAN_1     0.8918       0.8733
  42    KAMANJ)     0.8747       0.8572
  43    KAMAN_2     0.8718       0.8646
  44    GMD_2       0.8630       0.8366
  45    KAMAN.i     0.8614       0.8361
  46    VALEN_2     0.8409       0.8279
  47    IFAX        0.8304       0.8166
  48    VALEN_1     0.8207       0.8061
  49    KAMAN_4     0.7978       0.7819

Table 95: KODAK_2 correlation graph key for digits.
No Data Available


Figure 154: KODAK_2 - upper case correlation


There was no data for this evaluation.


Table 96: KODAK_2 correlation graph key for uppers.
No Data Available


Figure 155: KODAK_2 - lower case correlation


There was no data for this evaluation.


Table 97: KODAK_2 correlation graph key for lowers.
SYSTEM:  MIME

PARTICIPANT:  Francoise Fogelman

ORGANIZATION:  MIMETICS, Chatenay Malabry, France

FEATURES:  multi-layer TDNN
CLASSIFICATION:  LVQ

HARDWARE:  SUN 4, SPARC 1

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             45000     16000

STATUS:  on time, submitted as MIME_1

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0857    0.00    0.1007
             0.12    0.0361    0.14    0.0419
             0.14    0.0298    0.18    0.0331
             0.17    0.0276

OCR RATE (CPS):  DIGITS    UPPERS    LOWERS
SYS RATE:        5.0       3.0
CPU RATE:

NOTE:  classification is effectively nearest-neighbor
SYSTEM:  MIME
BIBLIOGRAPHY:

The following references have been provided for this system:

[29][30]
[Plot omitted: error rate (%) versus rejection rate (%) for MIME, with curves for digits and uppers.]

Figure 156: Error rate versus rejection rate for MIME


Figure 157: Error rate per writer of MIME
[Plot omitted: MIME digit correlation versus system number.]

Figure 158: MIME - digit correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1 

MIME 

1 0000 

1 0000 

2 

VOTE-M 

0 9239 

0 9093 

3 

ATT.4 

0 9193 

0 9007 

4 

ATTJ 

0 9163 

0 9013 

5 

KODAKS 

0 9160 

0 8992 

6 

VOTEJ> 

0 9169 

0 9064 

7 

AEG 

0.9165 

0 9011 

S 

ERIM.l 

0 9160 

0 8992 

9 

THINK.l 

0 9160 

0 8962 

10 

REFERENCE 

0 9143 

0 9143 

1 1 

ATTa 

0 9139 

0.9017 

12 

OCRSYS 

0.9138 

0 9079 

13 

ELSAGB J 

0 9132 

0.9010 

14 

ELSAGB^ 

0.9128 

0 9006 

16 

KODAK_l 

0.9119 

0.8948 

16 

ERIMJ 

0.9111 

0 8976 

17 

NIST.4 

0 91 10 

0 8930 

18 

IBM 

0.9106 

0.8989 

19 

ATTJ 

0.9103 

0 8942 

20 

UBOL 

0 9081 

0 8941 

21 

SYMBUS 

0 9079 

0 8927 

22 

NESTOR 

0 9069 

0 8936 

23 

ELSAGB.l 

0 9069 

0 8902 

24 

THINKS 

0 9030 

0 8932 

26 

NYNEX 

0 9020 

0 8914 

26 

HUGHES. 1 

0 9014 

0.8890 

27 

HUGHES.2 

0 9008 

0 8887 

28 

GTESS  J 

0 9007 

0 8809 

29 

GTESS.l 

0 9004 

0 8814 

30 

REI 

0 8997 

0.8920 

31 

NIST.l 

0 8962 

0 8739 

32 

NIST_2 

0 8940 

0.8670 

33 

ASOL 

0 8913 

0 8664 

34 

NIST  J 

0 8889 

0 8624 

36 

RISO 

0 8882 

0 8671 

36 

GMDJ 

0 8874 

0 8683 

37 

COMCOM 

0 8831 

0 8800 

38 

GMD.l 

0.8800 

0 8620 

39 

UPENN 

0 8797 

0 8699 

40 

KAMAN.l 

0 8733 

0 8469 

41 

GMD.4 

0 8666 

0 8480 

42 

KAMAN.3 

0 8666 

0.8306 

43 

KAMANJ 

0 8630 

0 8276 

44 

GMD.2 

0 8391 

0.8124 

46 

KAMAN.i 

0 8279 

0.8071 

46 

VALEN.2 

0 8096 

0.7960 

47 

IFAX 

0 8082 

0.7881 

48 

VALEN.l 

0 7997 

0.7789 

49 

KAMAN_4 

0.7882 

0.7606 

Table  98:  MIME  correlation  graph  key  for  digits. 
[Plot omitted: MIME upper case correlation versus system number.]

Figure 159: MIME - upper case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1 

MIME 

1 0000 

I 0000 

2 

VOTEJ^ 

0 9131 

0 8939 

3 

ATT.4 

0 9077 

0 8838 

4 

REFERENCE 

0 8993 

0 8993 

& 

AEG 

0 8977 

0.8841 

6 

KODAK_l 

0 896S 

0.8711 

7 

ATT  .2 

0 8962 

0.8766 

8 

VOTE-P 

0.8931 

0 8818 

9 

ERIM.1 

0 8911 

0,8750 

10 

UMICH.l 

0 8879 

0 8731 

11 

UBOL 

0 8867 

0.8681 

12 

NYNEX 

0 8864 

0 8734 

13 

IBM 

0 8860 

0 8673 

14 

SYMBUS 

0.8843 

0.8622 

IS 

ATTJ 

0 8839 

0.8653 

16 

NESTOR 

0 8828 

0 8683 

17 

ATT  J 

0 8813 

0,8643 

18 

HUGHES.! 

0 8793 

0 8631 

19 

HUGHES. 2 

0 8778 

0 8612 

20 

GTESS.l 

0 8764 

0.8558 

21 

GTESS.2 

0 87S1 

0 8546 

22 

NIST.4 

0.8669 

0 8401 

23 

ASOL 

0 8648 

0.8372 

24 

OCRSYS 

0 8627 

0 8521 

2S 

RISC 

0 8S8S 

0.8210 

26 

NIST.l 

0.84S3 

0 8149 

27 

GMD.l 

0 8341 

0 8095 

28 

REI 

0 8332 

0 8176 

29 

GMD.3 

0 8323 

0.8076 

30 

KAMAN.l 

0.8204 

0.7987 

31 

GMD.4 

0 8140 

0.7913 

32 

NISTJ 

0 8122 

0.7886 

33 

KAMANJ 

0 7782 

0 7560 

34 

COMCOM 

0,7764 

0.7709 

3S 

KAMAN.2 

0 7728 

0.7488 

36 

IFAX 

0 7726 

0.7529 

37 

NISTJ 

0 7673 

0 7375 

38 

GMD.2 

0-7S61 

0 7236 

39 

VALEN.l 

0 7349 

0 7113 

40 

KAMAN.4 

0,7163 

0.6904 

41 

KAMAN^ 

0 6422 

0 6234 

42 

UMICHJ 

0-0S32 

0.0193 

Table  99:  MIME  correlation  graph  key  for  uppers. 
No Data Available


Figure 160: MIME - lower case correlation

There was no data for this evaluation.

Table 100: MIME correlation graph key for lowers.
SYSTEM:  NESTOR

PARTICIPANT:  Christopher L. Scofield
ORGANIZATION:  Nestor, Inc., Providence, RI

FEATURES:  neocognitron - convolution, 120 dimensional feature vector.

CLASSIFICATION:  MLP. Two nets used in parallel, outputs averaged
to generate overall confidence value. Training with
gradient descent.

HARDWARE:  IBM RS6000 Model 320H development environment

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             40000     20000     20000     NSDB1
             1800      1800      1800      writers
             200000    40000     40000     NSDB1
             1800      1800      1800      writers
             15000     0         0         INTERNAL
             800       0         0         writers

STATUS:  on time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0453    0.00    0.0590    0.00    0.1539
             0.10    0.0129    0.10    0.0240    0.10    0.1074
             0.20    0.0050    0.20    0.0117    0.20    0.0704
             0.30    0.0029    0.30    0.0068    0.30    0.0469
             0.40    0.0016    0.40    0.0039    0.40    0.0325
             0.50    0.0011    0.50    0.0025    0.50    0.0213

OCR RATE (CPS):  DIGITS    UPPERS    LOWERS
SYS RATE:        14.10     16.80     13.10
CPU RATE:
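
The CLASSIFICATION entry above says that two nets run in parallel and that their outputs are averaged to give an overall confidence value. The sketch below shows one plausible reading of that step. It assumes each net emits one score per class, that the predicted class is the one with the largest averaged score, and that this averaged score serves as the confidence; `combine_two_nets` is a hypothetical name, not Nestor's code.

```python
from typing import Sequence, Tuple


def combine_two_nets(
    out_a: Sequence[float], out_b: Sequence[float]
) -> Tuple[int, float]:
    """Average two per-class output vectors and return
    (winning class index, averaged score used as confidence)."""
    if len(out_a) != len(out_b):
        raise ValueError("both nets must emit one score per class")
    averaged = [(a + b) / 2.0 for a, b in zip(out_a, out_b)]
    # Index of the largest averaged score.
    label = max(range(len(averaged)), key=averaged.__getitem__)
    return label, averaged[label]


if __name__ == "__main__":
    label, confidence = combine_two_nets([0.1, 0.7, 0.2], [0.2, 0.5, 0.3])
    print(label, confidence)
```

A confidence obtained this way could then be thresholded to produce the rejection rates listed in the RESULTS block.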
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SYSTEM:  NESTOR 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[19][31][34][32]  [36][37][10]  [33] [34] [35] [38] 
[Plot omitted: error rate (%) versus rejection rate (%) for NESTOR, with curves for digits, uppers, and lowers.]

Figure 161: Error rate versus rejection rate for NESTOR


[Plot omitted: error rate per writer for NESTOR; axis label: NUMBER WRITERS WITH ERROR.]

Figure 162: Error rate per writer of NESTOR
[Plot omitted: NESTOR digit correlation versus system number.]

Figure 163: NESTOR - digit correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1

NESTOR

1 0000 

1 0000 

2 

VOTE-M 

0.9609 

0 9478 

3 

OCRSYS 

0 9332 

0 9483 

A 

REFERENCE 

0 9347 

0 9347 

5 

IBM 

0 9323 

0 9386 

6 

ATT  J 

0 9318 

0.9384 

7 

AEG 

0 9316 

0 9389 

S 

VOTE_P 

0 9313 

0.9419 

9 

ERIM.l 

0.9497 

0.9361 

10 

KODAK.2 

0 9481 

0 9343 

ll 

ELSAGBJJ 

0 9477 

0.9373 

12 

ELSAGB^ 

0 9471 

0.9368 

13 

ATT -4 

0 9466 

0.9334 

M 

ATTJ 

0 9439 

0.9372 

13 

ERIM^ 

0 9436 

0 9339 

16 

KODAKJ 

0 9427 

0 9286 

17 

UBOL 

0.9413 

0 9299 

18 

SYMBUS 

0 9406 

0 9283 

19 

THINK  J 

0.9404 

0.9311 

20 

ATT  J 

0.9404 

0.9282 

21 

HUGHES. 2 

0 9393 

0.9268 

22 

HUGHES. 1 

0 9393 

0 9268 

23 

NIST.4 

0 9392 

0 9260 

2A 

ELS  AGB.l 

0.9392 

0 9234 

23 

NYNEX 

0 9388 

0 9286 

26 

REI 

0 9386 

0.9303 

27 

THINK.1 

0 9384 

0 9261 

28 

GTESS.l 

0 9222 

0 9107 

29 

COMCOM 

0 9213 

0 9186 

30 

GTESSJ 

0.9213 

0 9096 

31 

GMD.3 

0.9122 

0 8982 

32 

NIST.l 

0 9117 

0 8998 

33 

MIME 

0.9069 

0 8933 

34 

ASOL 

0 9067 

0 8921 

33 

GMD.l 

0 9044 

0 8916 

36 

NIST.2 

0 9033 

0 8897 

37 

UPENN 

0 9013 

0 8884 

38 

NISTJJ 

0 8992 

0 8833 

39 

RISC 

0.8933 

0.8772 

40 

KAMAN.l 

0 8898 

0 8713 

41 

GMD.4 

0 8891 

0.8769 

42 

KAMANJ 

0.8733 

0 8339 

43 

KAMAN.2 

0 8726 

0.8337 

44 

KAMAN-3 

0.8361 

0.8364 

43 

GMD.2 

0 8466 

0 8316 

46 

VALENJ 

0 8373 

0.8243 

47 

IFAX 

0 8279 

0 8131 

48 

VALEN.l 

0 8231 

0 8068 

49 

KAMAN.4 

0.7971 

0 7807 

Table 101: NESTOR correlation graph key for digits.
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Figure  164:  NESTOR  - upper  case  correlation 


System Number

System  Name 

Correlation  (all) 

Correlation  (correct) 

1 

NESTOR 

1 0000 

1.0000 

2 

VOTE-M 

0 9466 

0.9310 

3 

REFERENCE 

0 9410 

0 9410 

4 

AEG 

0 9374 

0 9238 

b 

ATT-4 

0 9308 

0 9133 

6 

UMICH.l 

0.9297 

0.9141 

7 

ERIM.l 

0.9233 

0.9117 

g 

NYNEX 

0 9232 

0.9126 

9 

VOTE-P 

0 9223 

0.9132 

10 

ATTJ 

0 9224 

0.9080 

11 

UBOL 

0 9194 

0 9030 

12 

IBM 

0 9191 

0.9038 

13 

KODAK-1 

0 9144 

0.8992 

14 

HUGHES. 1 

0.9140 

0.8991 

13 

ATTJ 

0.9139 

0 8991 

16 

HUGHES.2 

0 9121 

0.8970 

17 

ATTJ 

0 9111 

0.8977 

18 

SYMBUS 

0 9082 

0 8943 

19 

OCRSYS 

0 9036 

0 8917 

20 

GTESS.1 

0.8983 

0 8833 

21 

GTESSJ2 

0.8968 

0 8840 

22 

MIME 

0.8828 

0.8683 

23 

NIST.4 

0.8813 

0 8638 

24 

ASOL 

0 8747 

0.8399 

23 

REI 

0 8663 

0.8327 

26 

RISO 

0 8343 

0.8361 

27 

GMD.I 

0 8308 

0.8330 

28 

GMD.3 

0 8497 

0 8333 

29 

NIST.l 

0.8492 

0 8344 

30 

KAMAN.l 

0 8436 

0 8268 

31 

GMD.4 

0.8318 

0 8166 

32 

NISTJ 

0 8122 

0 8020 

33 

COMCOM 

0 8063 

0 8014 

34 

KAMAN.3 

0 8013 

0.7837 

33 

KAMANJ 

0 7916 

0.7736 

36 

IFAX 

0 7909 

0.7773 

37 

NIST.2 

0.7604 

0.7473 

38 

VALEN.l 

0.7333 

0.7383 

39 

GMD-2 

0.7494 

0.7343 

40 

KAMAN.4 

0 7238 

0 7091 

41 

KAMAN.5 

0 6648 

0.64  73 

42 

UM1CH.2 

0.0389 

0.0203 

Table  102:  NESTOR  correlation  graph  key  for  uppers. 
Figure 165: NESTOR - lower case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1 

NESTOR 

1 0000 

1 0000 

2 

VOTEJ^ 

0 8668 

0 8225 

3 

REFERENCE 

0 8461 

0 8461 

4 

ERIM.l 

0 8351 

0 7889 

6 

AEG 

0 8331 

0 7900 

6 

IBM 

0 8311 

0.7797 

7 

ATT_4 

0 8301 

0,7837 

0 

NYNEX 

0 8276 

0.7837 

9 

ATT  J 

0 8235 

0 7818 

10 

ROD AK J 

0 8235 

0 7791 

1 1 

UMICH,1 

0 8221 

0 7761 

12 

OCRSYS 

0 8205 

0 7809 

13 

ATTJ 

0 8142 

0 7780 

14 

ATTJ 

0 8141 

0-7681 

15 

UBOL 

0 8137 

0 7701 

16 

HUGHES. 1 

0 8062 

0.7664 

17 

HUGHES. 2 

0 8052 

0-7654 

18 

VOTEJ> 

0 8030 

0-7779 

19 

GTESS.l 

0 7898 

0.7498 

20 

GTESSJ! 

0-7822 

0,7429 

21 

NIST.l 

0-7803 

0 7423 

22 

NIST.4 

0-7744 

0,7317 

23 

RISC 

0 7694 

0 7243 

24 

GMD.3 

0.7642 

0.7247 

25 

ASOL 

0.7617 

0 7222 

26 

GMD.4 

0 7485 

0 7095 

27 

GMD.l 

0.7485 

0.7095 

28 

NISTJ 

0 7322 

0 7143 

29 

GMD.2 

0 6977 

0.6598 

30 

VALEN.l 

0 6812 

0 6363 

31 

KAMAN.l 

0 6800 

0 6382 

32 

NIST.2 

0 6657 

0.6322 

33 

RAMAN J 

0 6593 

0.6179 

34 

RAMAN.2 

0 6418 

0 6028 

35 

K AMAN_5 

0 5761 

0 5374 

36 

KAMAN.4 

0 5280 

0 4996 

37 

COMCOM 

0 4933 

0 4822 

38 

UMICHJ 

0.0962 

0.0531 

Table  103:  NESTOR  correlation  graph  key  for  lowers. 
SYSTEM:  NIST_1

PARTICIPANT:  Patrick J. Grother
ORGANIZATION:  NIST, Gaithersburg, MD

PREPROCESSING:  Size (preserving aspect ratio), Slant Normalization.
                Subtraction from binary image of mean of training images.

FEATURES:  Projection onto principal components of training set.
           32 leading elements of KL transform.

CLASSIFICATION:  K Nearest Neighbour. K not fixed; Distance weighted voting
                 among prototypes within 1.1 * distance of the closest prototype.

HARDWARE:  AMT 510C Array (32x32) Processor with Sparc 10 host.

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             ~22000    ~22000    ~22000    NSDB3
             2100      2100      2100      WRITERS

STATUS:  On time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0774    0.00    0.1385    0.00    0.1858

OCR RATE (CPS):  DIGITS    UPPERS    LOWERS
OCR RATE:
CPU RATE:        4.8       4.8       4.8
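
The PREPROCESSING and FEATURES entries above describe subtracting the mean training image and keeping the 32 leading elements of the KL transform. A minimal sketch of that projection step follows. It assumes the principal components (orthonormal eigenvectors of the training covariance, ordered by decreasing eigenvalue) were computed beforehand and are passed in as `basis`; the function name `kl_features` and the flat-list image representation are illustrative choices.

```python
from typing import List, Sequence


def kl_features(
    image_vector: Sequence[float],
    mean_vector: Sequence[float],
    basis: Sequence[Sequence[float]],
    n_components: int = 32,
) -> List[float]:
    """Project a mean-subtracted image onto the leading basis vectors."""
    if len(image_vector) != len(mean_vector):
        raise ValueError("image and mean must have the same length")
    # Subtract the mean of the training images from the input image.
    centered = [p - m for p, m in zip(image_vector, mean_vector)]
    # One inner product per retained principal component.
    return [
        sum(b * c for b, c in zip(component, centered))
        for component in basis[:n_components]
    ]


if __name__ == "__main__":
    # Toy 2-pixel example with a trivial basis, only to show the call.
    print(kl_features([3.0, 5.0], [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]]))
```

For a 32x32 binary image the vectors would have 1024 entries, and the returned list would hold the 32 KL coefficients used as features.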
SYSTEM:  NIST_1
BIBLIOGRAPHY:

The following references have been provided for this system:

[39]

COMMENTS:  NIST_1

See Cross Validation Section on Inadequacies of NIST Special Database 3 for the classification of
NIST Test Data 1.

The late system NIST_4 outperforms this system on digits on the basis of further preprocessing, a
larger training set, and more KL coefficients.

Very Slow Classification.

No exemplar pruning or aggregation.

Does not suffer from "minority" problems of perceptrons (e.g. crossed sevens).
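
The NIST_1 classifier is described as a nearest-neighbour rule in which K is not fixed: every prototype within 1.1 times the distance of the closest prototype takes part in a distance-weighted vote. The sketch below illustrates that rule. The weighting function is not given in this section, so inverse distance is used here as an assumption; `classify_variable_knn`, `radius_factor`, and `eps` are hypothetical names.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence


def classify_variable_knn(
    x: Sequence[float],
    prototypes: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
    radius_factor: float = 1.1,
    eps: float = 1e-12,
) -> Hashable:
    """Vote among all prototypes within radius_factor times the
    distance of the closest one, weighting each vote by 1/distance
    (the inverse-distance weight is an assumption)."""
    distances = [math.dist(x, p) for p in prototypes]
    cutoff = radius_factor * min(distances)
    votes = defaultdict(float)
    for d, label in zip(distances, labels):
        if d <= cutoff:
            votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)


if __name__ == "__main__":
    protos = [(1.0, 0.0), (0.0, 1.05), (-1.08, 0.0), (3.0, 3.0)]
    labs = ["a", "b", "b", "c"]
    print(classify_variable_knn((0.0, 0.0), protos, labs))
```

Because every training exemplar is compared with the input and none are pruned, a rule of this kind is slow at classification time, which matches the comments above.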
[Plot omitted: error rate versus rejection rate (%) for NIST_1, with curves for digits, uppers, and lowers.]

Figure 166: Error rate versus rejection rate for NIST_1


[Plot omitted: error rate per writer for NIST_1; axis label: NUMBER WRITERS WITH ERROR.]

Figure 167: Error rate per writer of NIST_1
Figure 168: NIST_1 - digit correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1 

NIST.l 

1 0000 

1 0000 

2 

VOTEJVl 

0 9325 

0 9174 

3 

THINK.l 

0.9277 

0 9048 

4 

AEG 

0 9247 

0 9096 

5 

ATT-4 

0 9247 

0 9072 

6 

NIST.4 

0 9247 

0.9028 

7 

VOTEJ* 

0 9236 

0 9131 

8 

ATTJ 

0.9236 

0 9106 

9 

ELSAGBJ 

0 9231 

0 9092 

10 

ELSAGB-2 

0 9230 

0 9090 

1 1 

KODAK J 

0 9227 

0 9063 

12 

REFERENCE 

0.9225 

0 9225 

13 

OCRSYS 

0 921 1 

0 9156 

L4 

ATTJ 

0.9204 

0 9072 

15 

ERIM.1 

0.9199 

0 9067 

16 

ERIMJ 

0.9182 

0 9049 

17 

KODAK J 

0 9182 

0 9016 

18 

UBOL 

0.9170 

0 9018 

19 

IBM 

0 9163 

0 9059 

20 

SYMBUS 

0 9167 

0 9006 

21 

ELSAGB.l 

0 9166 

0.8981 

22 

ATT  J 

0 9132 

0 8996 

23 

NESTOR 

0 9U7 

0 8998 

24 

THINKS 

0 9108 

0.9012 

25 

HUGHES. 1 

0.9091 

0 8966 

26 

HUGHES.2 

0 9089 

0 8965 

27 

NYNEX 

0.9087 

0 8988 

26 

GTESS  J 

0 9080 

0.8876 

29 

REI 

0 9078 

0 8999 

30 

GTESS.l 

0 9077 

0 8882 

31 

N1ST.2 

0.9032 

0 8748 

32 

GMD.3 

0 8996 

0 8775 

33 

NISTJ 

0 8974 

0 8691 

34 

MIME 

0 8962 

0 8739 

35 

ASOL 

0 8942 

0*717 

36 

RISO 

0 8926 

0 8618 

37 

GMD.l 

0 8913 

0 8709 

38 

COMCOM 

0 8903 

0.8876 

39 

UPENN 

0 8846 

0 8664 

40 

GMD-4 

0 8763 

0 8664 

41 

KAMAN-l 

0 8741 

0 8498 

42 

KAMANJJ 

0.8570 

0 8340 

43 

KAMAN_2 

0 8556 

0 8319 

44 

GMD.2 

0 8406 

0.8156 

46 

KAMAN.i 

0 8350 

0 8138 

46 

VALEN.2 

0 8121 

0 7997 

47 

IFAX 

0.8062 

0.7898 

48 

VALEN.l 

0.8019 

0 7826 

49 

KAMAN.4 

0.7863 

0 7623 

Table 104: NIST_1 correlation graph key for digits.




[Plot omitted: NIST_1 upper case correlation versus system number.]

Figure 169: NIST_1 - upper case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1

NIST_1

1 0000 

1 0000 

2 

VOTE_M 

0 8730 

0.8555 

3 

ATT.4 

0 8664 

0 8460 

4 

REFERENCE 

0 8615 

0.8615 

6 

AEG 

0 8585 

0 8457 

6 

KODAK-1 

0 8565 

0 8344 

7 

UBOL 

0 8557 

0.8346 

8 

VOTEJ* 

0 8551 

0 8453 

9 

ATTJ 

0.8550 

0.8338 

10 

ATTJ 

0.8529 

0 8364 

11 

CRlM.l 

0 8516 

0.8371 

12 

SYMBUS 

0 8504 

0 8291 

13 

NYNEX 

0 8498 

0.8375 

14 

NESTOR 

0 8492 

0.8344 

16 

UMICH-1 

0 8491 

0 8362 

16 

NIST.4 

0 8467 

0 8135 

17 

ATTJ 

0 8463 

0.8284 

18 

MIME 

0 8453 

0 8149 

19 

IBM 

0 8449 

0 8302 

20 

HUGHES. 1 

0.8439 

0.8275 

21 

GTESSU 

0 8429 

0 «217 

22 

HUGHES.2 

0 8427 

0.8261 

23 

GTESS J 

0 8404 

0 8203 

24 

RISC 

0 8338 

0 7925 

25 

ASOL 

0 8309 

0 8036 

26 

OCRSYS 

0 8277 

0 8166 

27 

GMD.l 

0 8239 

0 7892 

28 

GMD.3 

0 8215 

0 7870 

29 

GMD.4 

0 8035 

0 7711 

30 

REI 

0.8013 

0.7860 

31 

NISTJ 

0.7945 

0.7658 

32 

KAMAN.l 

0 7921 

0.7689 

33 

KAMANJ 

0 7570 

0.7315 

34 

NIST-2 

0 7569 

0 7193 

35 

KAMAN-2 

0.7494 

0 7229 

36 

COMCOM 

0 7474 

0.7421 

37 

IFAX 

0 7436 

0 7231 

38 

GMD.2 

0 7304 

0 6986 

39 

VALEN.I 

0.7111 

0 6850 

40 

KAMAN-4 

0 6986 

0 6691 

41 

KAMANJ 

0 6239 

0 6022 

42 

UMICHJ 

0 0616 

0 0185 

Table 105: NIST_1 correlation graph key for uppers.
Figure 170: NIST_1 - lower case correlation


System  Number 

System  Name 

Correlation  ( all) 

Correlation  (correct) 

1 

NIST.l 

1.0000 

1 0000 

2 

VOTEJW 

0.8425 

0,7937 

3 

REFERENCE 

0 8142 

0 8142 

4 

ATTJ 

0 8036 

0 7554 

b 

ERIM.l 

0 8030 

0.7584 

6 

ATT-4 

0 8030 

0 7559 

7 

ATTJ 

0.7974 

0.7552 

8 

KODAKJ 

0 7934 

0 7503 

9 

AEG 

0.7924 

0.7542 

10 

NYNEX 

0.7897 

0.7498 

11 

UBOL 

0 7883 

0 7409 

12 

ATTJ 

0.7857 

0 7390 

13 

IBM 

0.7828 

0.7413 

14 

VOTE-P 

0 7822 

0 7573 

lb 

UMICH.I 

0.7820 

0.7408 

16 

OCRSYS 

0.7809 

0 7448 

17 

NESTOR 

0.7803 

0 7423 

18 

RISC 

0 7784 

0 7145 

19 

GTESS.l 

0.7763 

0.7307 

20 

HUGHES. 1 

0.7744 

0.7364 

21 

GTESSJ 

0 7740 

0 7262 

22 

HUGHES.2 

0 7707 

0.7337 

23 

NIST.4 

0,7699 

0.7144 

24 

GMD.3 

0 7692 

0.7132 

25 

ASOL 

0 7581 

0 7065 

26 

GMD.4 

0 7S42 

0 6981 

27 

GMD.l 

0 7542 

0.6981 

28 

NIST  J 

0 7493 

0 7122 

29 

GMDJ 

0 7031 

0 6515 

30 

NISTJ 

0 6919 

0 6340 

31 

KAMAN.l 

0 6604 

0 6181 

32 

VALEN.1 

0,6572 

0 6127 

33 

KAMANJJ 

0 6388 

0.5978 

34 

KAMAN-2 

0 6255 

0 5847 

35 

KAMAN.S 

0.5562 

0 5202 

36 

KAMAN.4 

0 5225 

0 4866 

37 

COMCOM 

0 4782 

0 4678 

38 

UMICH-2 

0.1046 

0.0534 

Table 106: NIST_1 correlation graph key for lowers.
SYSTEM:  NIST_2

PARTICIPANT:  Patrick J. Grother

ORGANIZATION:  NIST, Gaithersburg, MD

PREPROCESSING:  Size (preserving aspect ratio), Slant Normalization.

FEATURES:  Projection onto 4 quadrant gabor wavelets. One frequency, two phases,
four angles gives 32 element "Gabor Transform". Least squares
fitting.

CLASSIFICATION:  Scaled conjugate gradient trained 32:48:{10,26} perceptron.
HARDWARE:  AMT 510C Array (32x32) Processor with Sparc 10 host.

TRAINING:    DIGITS    UPPERS    LOWERS    DATABASE
             ~102000   ~45000    ~46000    NSDB3
             2100      2100      2100      WRITERS

STATUS:  On time

RESULTS:     -- DIGITS --      -- UPPERS --      -- LOWERS --      DATABASE
             REJ.    ERR.      REJ.    ERR.      REJ.    ERR.      TESTDATA1
             RATE    RATE      RATE    RATE      RATE    RATE
             0.00    0.0919    0.00    0.2310    0.00    0.3120
             0.10    0.0519    0.10    0.1793    0.10    0.2657
             0.20    0.0285    0.20    0.1384    0.20    0.2220
             0.30    0.0150    0.30    0.1028    0.30    0.1805
             0.40    0.0092    0.40    0.0728    0.40    0.1440
             0.50    0.0060    0.50    0.0524    0.50    0.1078

OCR RATE (CPS):  DIGITS    UPPERS    LOWERS
OCR RATE:
CPU RATE:        81.0      81.0      81.0
SYSTEM:  NIST_2

BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 

[40] 

COMMENTS:  NIST_2 

See  Cross  Validation  Section  on  Inadequacies  of  NIST  Special  Database  3 for  the  classification  of 
NIST  Test  Data  1. 

Insufficient  / Inappropriate  Gabor  Bases. 
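
The NIST_2 feature description (4 quadrants, one frequency, two phases, four angles) yields 4 x 1 x 2 x 4 = 32 values per image. The sketch below shows how such a 32-element vector could be assembled by plain inner products with Gabor-like kernels centred on each quadrant. It does not reproduce the least squares fitting mentioned in the sheet, and the kernel frequency, envelope width, phase values, and quadrant centres are all assumed for illustration.

```python
import math
from typing import List, Sequence


def gabor_kernel_value(dx: float, dy: float, theta: float,
                       phase: float, freq: float, sigma: float) -> float:
    """Gaussian envelope times a cosine wave along direction theta."""
    u = dx * math.cos(theta) + dy * math.sin(theta)
    envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * freq * u + phase)


def gabor_features(image: Sequence[Sequence[float]],
                   freq: float = 0.125, sigma: float = 4.0) -> List[float]:
    """Return 32 projections: 4 quadrants x 4 angles x 2 phases."""
    half = len(image) // 2
    # Assumed kernel centres: the middle of each image quadrant.
    centers = [(half / 2 + qx * half, half / 2 + qy * half)
               for qy in (0, 1) for qx in (0, 1)]
    angles = [k * math.pi / 4 for k in range(4)]
    phases = (0.0, math.pi / 2)
    features = []
    for cx, cy in centers:
        for theta in angles:
            for phase in phases:
                total = 0.0
                for y, row in enumerate(image):
                    for x, pixel in enumerate(row):
                        if pixel:
                            total += pixel * gabor_kernel_value(
                                x - cx, y - cy, theta, phase, freq, sigma)
                features.append(total)
    return features


if __name__ == "__main__":
    img = [[0] * 32 for _ in range(32)]
    img[8][8] = 1
    print(len(gabor_features(img)))
```

The resulting vector has the same length as the 32-input layer of the perceptron named in the CLASSIFICATION entry.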
[Plot omitted: error rate versus rejection rate for NIST_2, with curves for digits, uppers, and lowers.]

Figure 171: Error rate versus rejection rate for NIST_2


[Plot omitted: error rate per writer for NIST_2.]

Figure 172: Error rate per writer of NIST_2
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Figure  173:  NIST_2  - digit  correlation 


System Number

System Name

Correlation (all)

Correlation (correct)

1

NIST_2

1 0000 

1.0000 

2 

VOTEJH 

0 9201 

0 9046 

3 

ATT.4 

0 9172 

0 8973 

4 

NISTJ 

0.9138 

0.8715 

S 

ATTJ 

0 9134 

0 8974 

6 

VOTE_P 

0 9130 

0 9021 

7 

KODAKS 

0,9127 

0.8952 

% 

AEG 

0 9124 

0.8968 

9 

ERIM-1 

0.9117 

0.8954 

10 

THINK.l 

0 9097 

0 8900 

11 

NIST.4 

0 9094 

0 8896 

12 

ELSAGB-3 

0,9087 

0 8962 

13 

ATT  J 

0.9084 

0 8911 

L4 

ELSAGB^ 

0 9083 

0 8958 

15 

REFERENCE 

0 9081 

0 9081 

16 

ATTJ 

0 9080 

0 8961 

17 

ERIM^ 

0 9080 

0 8936 

IS 

OCRSYS 

0 9074 

0 9016 

19 

GTESSJ 

0 9071 

0 8810 

20 

KODAKJ 

0 9070 

0 8899 

21 

SYMBUS 

0 9068 

0 8902 

22 

GTESSU 

0 9068 

0 8816 

23 

ELSAGB.l 

0 9065 

0 8881 

24 

UBOL 

0.9069 

0.8902 

25 

IBM 

0.9051 

0 8939 

26 

NESTOR 

0 9035 

0,8897 

27 

NIST.l 

0 9032 

0.8748 

28 

HUGHES.2 

0 8996 

0,8857 

29 

HUGHES.! 

0 8995 

0 8857 

30 

THINK.2 

0 8964 

0 8876 

31 

NYNEX 

0 8956 

0.8859 

32 

REI 

0 8952 

0 8870 

33 

MIME 

0 8940 

0.8670 

34 

ASOL 

0 8929 

0 8655 

35 

RISC 

0.8909 

0 8561 

36 

GMD.3 

0 8870 

0 8661 

37 

UPENN 

0 8797 

0 8576 

38 

GMD-1 

0 8786 

0.8596 

39 

KAMAN.l 

0 8785 

0,8470 

40 

COMCOM 

0 8762 

0.8739 

41 

GMD.4 

0 8643 

0 8456 

42 

KAMAN.3 

0 8580 

0.8303 

43 

KAMAN.2 

0 8571 

0 8283 

44 

GMD.2 

0 8497 

0 8152 

45 

KAMAN.S 

0.8313 

0 8080 

46 

IFAX 

0 8044 

0 7843 

47 

VALEN.2 

0,8039 

0.7909 

48 

VALEN.l 

0.7972 

0 7766 

49 

KAMAN-4 

0 7950 

0.7624 

Table  107:  NIST_2  correlation  graph  key  for  digits. 
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Figure  174:  NIST_2  - upper  case  correlation 


System  Number 

System  Name 

Correlation  ( all) 

Correlation  (correct) 

1 

NIST_2

1.0000

1.0000

2 

NISTJ 

0,7818 

0,7235 

3 

VOTE^ 

0,7780 

0,7634 

4 

ATT.4 

0 7770 

0.7586 

6 

RISO 

0.7765 

0 7253 

6 

KODAKJ 

0 7720 

0 7500 

7 

REFERENCE 

0 7690 

0 7690 

8 

SYMBUS 

0.7688 

0,7463 

9 

MIME 

0-7673 

0.7375 

10 

ATTJ 

0 7668 

0.751 1 

1 1 

AEG 

0 7667 

0 7561 

12 

VOTE_P 

0,7664 

0 7582 

13 

GTESS.l 

0.7643 

0.7414 

14 

ERIM.l 

0.7640 

0.7519 

15 

UBOL 

0 7633 

0.7465 

16 

ATT  J 

0 7631 

0 7452 

17 

GTESSJ 

0 7625 

0.7406 

18 

IBM 

0 7607 

0.7463 

19 

NESTOR 

0 7604 

0.7473 

20 

UMICH.1 

0 7598 

0.7478 

21 

ATTJ 

0 7586 

0 7441 

22 

NYNEX 

0 7579 

0.7485 

23 

HUGHES.! 

0 7573 

0.7421 

24 

HUGHES. 2 

0 7571 

0.7409 

25 

NIST.l 

0 7569 

0.7193 

26 

ASOL 

0 7540 

0 7270 

27 

NIST.4 

0 7535 

0 7278 

28 

OCRSYS 

0 7396 

0.7311 

29 

GMD.l 

0.7313 

0 7063 

30 

GMD.3 

0.7302 

0.7050 

31 

KAMAN.l 

0 7268 

0.7012 

32 

REI 

0 7232 

0 7083 

33 

GMD-4 

0 7142 

0 6904 

34 

GMD.2 

0 7063 

0 6569 

35 

K AM  AN-3 

0 6990 

0 6706 

36 

KAMAN.2 

0 6953 

0.6640 

37 

IFAX 

0 6773 

0 6556 

38 

COMCOM 

0 6697 

0.6660 

39 

KAMAN.4 

0 6592 

0 6189 

40 

VALEN.1 

0 6581 

0 6298 

41 

KAMAN.S 

0.5829 

0.5580 

42 

UMICHJ 

0.0672 

0.0157 

Table  108:  NIST_2  correlation  graph  key  for  uppers. 
[Plot omitted: NIST_2 lower case correlation versus system number.]

Figure 175: NIST_2 - lower case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1

NIST_2

1.0000

1.0000

2 

RISC 

0 7193 

0 6332 

Z 

VOTE.M 

0 7U2 

0 6715 

4 

NIST  J 

0 7107 

0.6415 

& 

ATT.4 

0 6933 

0 6484 

6 

NIST.l 

0 6919 

0 6340 

7 

REFERENCE 

0 6880 

0 6880 

8 

ERIM.l 

0 6877 

0 6485 

9 

ATTJ 

0 6860 

0 6465 

10 

GTESS.1 

0 6843 

0 6331 

1 1 

IBM 

0 6783 

0 6357 

12 

GTESS  J 

0 6783 

0 6279 

13 

KODAK-1 

0 6765 

0 6399 

14 

ATT  J 

0 6756 

0 6342 

li 

VOTE_P 

0 6733 

0 6521 

16 

OCRSYS 

0 8722 

0 6364 

17 

UMICH.l 

0 6718 

0 6354 

18 

NYNEX 

0 6707 

0 6391 

19 

ATTJ 

0 6705 

0 6374 

20 

AEG 

0 6702 

0 6384 

21 

UBOL 

0 6684 

0 6291 

22 

HUGHES. 1 

0 6675 

0 6302 

23 

NESTOR 

0 6657 

0 6322 

24 

HUGHES.2 

0 6653 

0 6284 

25 

NIST.4 

0 6631 

0 6140 

26 

ASOL 

0 6581 

0 6076 

27 

GMD.3 

0 6538 

0 6083 

28 

GMD  J 

0 6515 

0 5825 

29 

GMD.4 

0 6395 

0 5954 

30 

GMD.l 

0 6395 

0.5954 

31 

KAMAN.l 

0 5961 

0 5460 

32 

VALEN.l 

0.5858 

0 5372 

33 

KAMAN J 

0.5806 

0 5297 

34 

K AMAN.2 

0 5675 

0 5173 

35 

K AMAN.5 

0 5059 

0.4600 

36 

KAMAN-4 

0 4865 

0.4370 

37 

COMCOM 

0 4168 

0 4075 

38 

UMICH  J 

0.1052 

0 0382 

Table  109:  NIST_2  correlation  graph  key  for  lowers. 




SYSTEM:  NIST.3 


PARTICIPANT:  Patrick  J.  Grother 
ORGANIZATION:  NIST,  Gaithersburg,  MD 

PREPROCESSING:  Size  (preserving  aspect  ratio).  Slant  Normalization. 

Subtraction from binary image of mean of training images.

FEATURES:  Projection  onto  principal  components  of  training  set. 

32  leading  elements  of  KL  transform. 

CLASSIFICATION:  Scaled  conjugate  gradient  trained  32: 48: {10, 26}  perceptron. 
HARDWARE:  AMT  510C  Array  (32x32)  Processor  with  Sparc  10  host. 


TRAINING:   DIGITS   UPPERS   LOWERS   DATABASE
            ~77000   ~26000   ~26000   NSDB3
              2100     2100     2100   WRITERS

STATUS:  On  time,  submitted  as  NIST_0 

RESULTS:   -- DIGITS --       -- UPPERS --       -- LOWERS --      DATABASE
           REJ.    ERR.       REJ.    ERR.       REJ.    ERR.      TESTDATA1
           RATE    RATE       RATE    RATE       RATE    RATE
           0.00    0.0973     0.00    0.1693     0.00    0.2029
           0.10    0.0529     0.10    0.1172     0.10    0.1521
           0.20    0.0286     0.20    0.0757     0.20    0.1122
           0.30    0.0160     0.30    0.0520     0.30    0.0853
           0.40    0.0103     0.40    0.0331     0.40    0.0629
           0.50    0.0070     0.50    0.0184     0.50    0.0458

OCR  RATE  (CPS) : 

DIGITS 

UPPERS 

LOWERS 

OCR  RATE: 

CPU  RATE: 

142.6 

64.3 

64.3 
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SYSTEM:  NIST.3 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 

[41] 

COMMENTS:  NIST.3 

See  Cross  Validation  Section  on  Inadequacies  of  NIST  Special  Database  3 for  the  classification  of 
NIST  Test  Data  1. 

Small training set. Small number of KL coefficients. The KL basis and the first MLP layer both
perform linear (affine) transformations, so they can be premultiplied into a single matrix.
Algorithmic complexity is low: dominated by two matrix multiplies. Very fast.
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ERROR  RATE  (%) 


NIST  0 — DIGITS  UPPERS  LOWERS 


REJECTION  RATE  (%) 


Figure  176:  Error  rate  versus  rejection  rate  for  NIST_3 




Figure  177:  Error  rate  per  writer  of  NIST_3 
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NnT.XOMfT^WCLATE 


SYSTEM  NUMBER 


Figure  178:  NIST_3  - digit  correlation 


System Number

System Name

Correlation (all)

Correlation  (correct) 

1

NIST_3

1.0000

1.0000

2 

VOTE_M 

0 9151 

0.8994 

3 

NISTJ 

0 9138 

0 8715 

4 

ATT.4 

0 91 14 

0 8915 

5 

AEG 

0 9096 

0 8928 

6 

KODAKS 

0 9093 

0 8912 

7 

ATTJ 

0.9089 

0 8923 

8 

VOTEJ» 

0 9081 

0.8971 

9 

ERIM.I 

0 9078 

0 8910 

10 

NIST.4 

0 9071 

0 8858 

1 1 

ATT  J 

0 9048 

0 8866 

12 

KODAKU 

0 9046 

0 8868 

13 

ELS  AGBJ) 

0 9044 

0 8913 

14 

GTESSJ 

0 9043 

0 8772 

15 

ELSAGB J 

0 9040 

0 8909 

16 

THINK.l 

0 9037 

0 8845 

17 

ATTU 

0 9032 

0 8912 

IS 

GTESS.l 

0 9032 

0 8773 

19 

ERIM  J 

0 9031 

0 8885 

20 

REFERENCE 

0 9027 

0 9027 

21 

OCRSYS 

0 9017 

0 8963 

22 

SYMBUS 

0 9017 

0 8850 

23 

UBOL 

0 9016 

0 8857 

24 

ELSAGB.l 

0.9010 

0 8830 

25 

IBM 

0 9003 

0 8889 

26 

NESTOR 

0 8992 

0 8853 

27 

NIST.l 

0 8974 

0 8691 

28 

HUGHES. 1 

0 8949 

0 8811 

29 

HUGHES.2 

0.8945 

0.8809 

30 

THINK.2 

0.8916 

0 8828 

31 

NYNEX 

0 8905 

0.8809 

32 

MIME 

0 8889 

0 8624 

33 

REI 

0 8887 

0 8812 

34 

RISO 

0 8882 

0 8526 

35 

ASOL 

0 8881 

0 8610 

36 

GMD.3 

0 8848 

0 8624 

37 

KAMAN.l 

0 8801 

0.8463 

38 

UPENN 

0 8763 

0-8537 

39 

GMD.l 

0 8759 

0 8656 

40 

COMCOM 

0 8713 

0 8689 

41 

GMD.4 

0 8617 

0 8417 

42 

KAMANJJ 

0 8604 

0.8303 

43 

KAMAN.2 

0 8591 

0.8284 

44 

GMD.2 

0 8482 

0.8126 

45 

KAMANJ 

0 8310 

0 8074 

46 

VALENJ 

0.8066 

0 7922 

47 

IFAX 

0 8029 

0.7825 

48 

KAMAN.4 

0 7976 

0 7628 

49 

VALEN.l 

0 7929 

0 7729 

Table  110:  NIST_3  correlation  graph  key  for  digits. 
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N«r_i.uf>peRconREUiTf 


SYSTEM  NUHBEH 

Figure 179: NIST_3 - upper case correlation


System Number

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

NIST_3

1.0000

1.0000 

2 

VOTEJVl 

0 8318 

0 8210 

3 

REFERENCE 

0 8307 

0.8307 

4 

ATT. 4 

0-8307 

0 8148 

& 

KODAKJ 

0 8241 

0.8056 

6 

AEG 

0 8196 

0 8125 

7 

VOTEJ’ 

0 8189 

0.8137 

8 

ERIM.l 

0 8179 

0 8072 

9 

SYMBUS 

0.8179 

0 8006 

10 

UBOL 

0 8172 

0 8028 

11 

ATTJ 

0 8168 

0-8015 

12 

ATTJ 

0.8159 

0 8049 

13 

GTESS-1 

0.8125 

0.7944 

14 

NESTOR 

0 8122 

0.8020 

13 

MIME 

0 8122 

0,7886 

16 

GTESS J 

0 8117 

0 7934 

17 

NYNEX 

0 81 16 

0 8048 

18 

IBM 

0 8112 

0 8013 

19 

UMICH.I 

0 8100 

0 8024 

20 

ATTJ 

0.8076 

0.7972 

21 

RISO 

0 8062 

0 7692 

22 

HUGHES. 1 

0 8051 

0.7947 

23 

HUGHES.2 

0 8034 

0 7931 

24 

N1ST.4 

0 7991 

0 7803 

25 

ASOL 

0.7972 

0.7761 

26 

NIST.l 

0.7945 

0.7658 

27 

OCRSYS 

0 7908 

0.7853 

28 

NIST.2 

0.7818 

0 7235 

29 

KAMAN.l 

0 7729 

0.7499 

30 

GMD.l 

0 7719 

0.7537 

31 

GMD.3 

0 7705 

0.7526 

32 

REI 

0.7694 

0 7583 

33 

GMD.4 

0 7542 

0.7370 

34 

KAMAN-3 

0 7406 

0-7164 

35 

KAMAN.2 

0.7374 

0.7100 

36 

GMD.2 

0 7266 

0.6895 

37 

IFAX 

0.7140 

0.6981 

38 

COMCOM 

0.7133 

0.7119 

39 

valen.i 

0.6918 

0 6691 

40 

KAMAN.4 

0 6910 

0.6583 

41 

KAMAN.S 

0.6135 

0.5922 

42 

UMICHJ 

0.0601 

0 0204 

Table  111:  NIST_3  correlation  graph  key  for  uppers. 
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MST_U.OWERCORRELATE 


SV8TQI  NUMBER 

Figure 180: NIST_3 - lower case correlation


System Number

System Name

Correlation (all)

Correlation  (correct) 

1 

NIST_3

1.0000

1.0000

2

REFERENCE

0.7971

0.7971

3 

VOTE.M 

0 7874 

0 7628 

4 

ATT.« 

0 7650 

0 7349 

S 

ATT-2 

0 7527 

0 7306 

6 

ERIM-X 

0.7521 

0 7307 

7 

NIST.l 

0 7493 

0 7122 

S 

KODAKJ 

0 7485 

0.7257 

9 

NYNEX 

0.7443 

0.7258 

10 

GTESS.l 

0 7429 

0 7107 

L 1 

RISC 

0 7418 

0.6964 

12 

GTESS J 

0.7413 

0.7079 

L3 

VOTE_P 

0 7412 

0.7312 

14 

AEG 

0 7394 

0.7235 

15 

ATTJ 

0.7383 

0.7210 

Id 

ATTJ 

0 7382 

0.7129 

17 

IBM 

0 7337 

0.7142 

18 

NESTOR 

0.7322 

0 7143 

19 

UBOL 

0 7305 

0 7105 

20 

UMICH.l 

0.7301 

0 7120 

21 

HUGHES. 1 

0.7293 

0 7105 

22 

OCRSYS 

0 7278 

0 7138 

23 

HUGHES.2 

0 7251 

0 7074 

24 

ASOL 

0 7187 

0 6867 

25 

NIST.4 

0 7171 

0 6888 

26 

GMD.3 

0 7129 

0 6849 

27 

NIST.2 

0 7107 

0 6415 

28 

GMD.4 

0.6983 

0 6709 

29 

GMD.l 

0.6983 

0 6709 

30 

GMD.2 

0 6850 

0 6412 

31 

KAMAN.l 

0 6411 

0 6094 

32 

K AMAN.3 

0 6222 

0 5898 

33 

VALEN.1 

0-6208 

0 5924 

34 

KAMAN.2 

0 6083 

0 5755 

35 

KAMAN-i 

0 5383 

0 5111 

36 

KAMAN.4 

0 5127 

0.4796 

37 

COMCOM 

0 4555 

0.4523 

38 

UMICHJ 

0.1058 

0 0629 

Table  112:  NIST_3  correlation  graph  key  for  lowers. 
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SYSTEM:  NYNEX 


PARTICIPANT:  Atul  Chhabra 

ORGANIZATION:  Nynex Science & Technology, Inc., White Plains, NY

FEATURES:  model or stroke, automatic feature selection. A large
number of pre-segmentation points are first generated.
The algorithm effectively chooses a subset of them
that provides the most confident recognition.

CLASSIFICATION:  MLP 


HARDWARE:  SPARC2  with  40  Mbyte  ram,  coded  in  C. 


TRAINING:   DIGITS   UPPERS   LOWERS   DATABASE
             40000    45000            INTERNAL
                 0        0    35000   NSDB3

STATUS:     on time

RESULTS:   -- DIGITS --       -- UPPERS --       -- LOWERS --      DATABASE
           REJ.    ERR.       REJ.    ERR.       REJ.    ERR.      TESTDATA1
           RATE    RATE       RATE    RATE       RATE    RATE
           0.00    0.0432     0.00    0.0491     0.00    0.1403
           0.10    0.0128     0.10    0.0175     0.10    0.0994
           0.20    0.0052     0.20    0.0092     0.20    0.0646
           0.30    0.0029     0.30    0.0065     0.30    0.0433
           0.40    0.0032     0.40    0.0050     0.40    0.0283
           0.50    0.0034     0.50    0.0050     0.50    0.0215

OCR RATE (CPS):   DIGITS   UPPERS   LOWERS
SYS RATE:          22.00    12.00    12.00
CPU RATE:

NOTE:  Internal database includes digits and upper case letters from NSDB1.

NOTE:  Suggested that NIST be involved in proctoring future tests.


SYSTEM:  NYNEX 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
none 
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ERROR  RATE  (%) 


NYNEX  — DIGITS  UPPERS  LOWERS 


REJECTION  RATE  (%) 


Figure 181: Error rate versus rejection rate for NYNEX




Figure  182:  Error  rate  per  writer  of  NYNEX 



Figure  183:  NYNEX  - digit  correlation 


System  Number 

System  Name 

Correlation (all)

Correlation  (correct) 

1 

NYNEX 

1 0000 

1 0000 

2 

REFERENCE 

0.9668 

0 9668 

3 

OCRSYS 

0.9664 

0 9601 

4 

VOTEJkl 

0 9660 

0 9466 

b 

IBM 

0 9470 

0 9371 

6 

ATTa 

0 9462 

0.9380 

7 

ATTJ 

0 9469 

0,9360 

9 

AEG 

0 9464 

0.9363 

9 

VOTEJ» 

0 9460 

0 9379 

10 

ELSAGB-3 

0 9443 

0 9363 

1 1 

ELSAGB J 

0 9440 

0.9360 

12 

KODAK_2 

0 9429 

0 9323 

13 

ATTa 

0.9422 

0.9321 

14 

ERIMU 

0.9421 

0 9331 

13 

THINKS 

0.9419 

0 9328 

16 

ERIM  J 

0 9417 

0 9326 

17 

REI 

0 9406 

0.9323 

18 

NESTOR 

0 9388 

0 9286 

19 

KODAKJ 

0.9374 

0 9266 

20 

UBOL 

0 9366 

0 9282 

21 

THINK.l 

0 9360 

0 9247 

22 

HUGHES. 1 

0 9349 

0 9260 

23 

SYMBUS 

0.9348 

0 9266 

24 

HUGHES. 2 

0.9344 

0 9247 

26 

NIST.4 

0 9337 

0 9238 

26 

ATT  J 

0 9336 

0 9261 

27 

ELSAGB.l 

0 9329 

0.9228 

28 

COMCOM 

0 9266 

0,9219 

29 

GTESS.l 

0 9179 

0.9093 

30 

GTESS  J 

0 9169 

0 9074 

31 

NIST.l 

0.9087 

0.8988 

32 

GMD.3 

0,9047 

0 8944 

33 

MIME 

0 9020 

0 8914 

34 

UPENN 

0 9002 

0 8880 

36 

ASOL 

0 9000 

0 8889 

36 

GMD.l 

0 6982 

0 8883 

37 

NISTJ 

0 8966 

0 8869 

38 

NISTJ 

0 8906 

0.8809 

39 

RISC 

0 8867 

0 8741 

40 

GMD.4 

0 8840 

0.8743 

41 

KAMAN.l 

0 8786 

0 8660 

42 

KAMANJJ 

0 8614 

0 8496 

43 

KAMAN.2 

0-8676 

0.8468 

44 

KAMAN J 

0 8414 

0 8303 

46 

GMD.2 

0 8389 

0-8277 

46 

VALENJ 

0.8346 

0,8237 

47 

IFAX 

0.8262 

0 8127 

48 

VALEN.1 

0 8170 

0 8042 

49 

KAMAN.4 

0 7853 

0 7760 

Table  113:  NYNEX  correlation  graph  key  for  digits. 
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MYNBLUPfCaCORRELATE 


SVaTEM  NUMBER 


Figure 184: NYNEX - upper case correlation


System Number

System  Name 

Correlation  (all) 

Correlation  (correct) 

1 

NYNEX 

1.0000 

1.0000 

2 

VOTE-M 

0 9633 

0 9393 

3 

REFERENCE 

0 9609 

0 9609 

4 

AEG 

0 9440 

0 9312 

5 

ATT.4 

0 9371 

0 9227 

6 

UMICH.1 

0 9311 

0 9190 

7 

ATTJ 

0 9307 

0 9166 

8 

ERIM-1 

0 9306 

0 9182 

9 

VOTEJ> 

0.9264 

0 9163 

10 

NESTOR 

0 9262 

0.9126 

11 

IBM 

0.9231 

0 9094 

12 

UBOL 

0.9226 

0,9091 

13 

ATT  J 

0.9209 

0 9069 

14 

KODAKU 

0 9203 

0.9060 

13 

ATT  J 

0.9186 

0.9062 

16 

HUGHES-1 

0 9182 

0.9060 

17 

HUGHES. 2 

0 9169 

0 9042 

18 

SYMBUS 

0 9137 

0.9008 

19 

OCRSYS 

0 9104 

0 8989 

20 

GTESSU 

0 9079 

0 8942 

21 

GTESS.2 

0 9066 

0 8928 

22 

MIME 

0.8864 

0 8734 

23 

NIST.4 

0 8829 

0 8704 

24 

ASOL 

0 8787 

0.8647 

26 

REI 

0 8704 

0 8686 

26 

RISO 

0 8603 

0 8369 

27 

NIST.l 

0.8498 

0 8376 

28 

GMD.l 

0 8478 

0.8369 

29 

GMD-J 

0-8467 

0 8343 

30 

KAMAN.l 

0 8400 

0 8272 

31 

GMD.4 

0 8288 

0.8174 

32 

COMCOM 

0 8180 

0 8113 

33 

NISTJI 

0.8116 

0 8048 

34 

KAMAN.3 

0 7949 

0,7827 

36 

IFAX 

0.7947 

0.7828 

36 

KAMAN^ 

0.7866 

0 7731 

37 

NIST.2 

0 7679 

0 7486 

38 

VALEN-1 

0 7496 

0.7376 

39 

GMD.2 

0 7467 

0.7344 

40 

KAMAN.4 

0.721i 

0.7094 

41 

KAMAN^ 

0.6661 

0-6460 

42 

UMICH.2 

0 0407 

0 0237 

Table  114:  NYNEX  correlation  graph  key  for  uppers. 
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NVNEXLOWERCOfWELATE 


8VSTQI  NUMBER 


Figure  185:  NYNEX  - lower  case  correlation 


System Number

System Name

Correlation (all)

Correlation  (correct) 

1 

NYNEX

1.0000

1.0000

7 

VOTE^ 

0 8806 

0 8339 

3 

REFERENCE 

0 8697 

0.8697 

4 

ERIM.1 

0 8429 

0 7986 

A 

ATT. 4 

0.8396 

0.7928 

6 

ATTJ 

0 8378 

0 7939 

7 

AEG 

0 8373 

0 7979 

8 

KODAKU 

0 8320 

0.7888 

9 

ATTU 

0.8318 

0 7912 

10 

IBM 

0 8296 

0 7837 

11 

NESTOR 

0,8276 

0 7837 

12 

OCRSYS 

0 8232 

0 7866 

13 

HUGHES. 1 

0.8209 

0.7796 

14 

ATT  J 

0 8197 

0.7768 

15 

HUGHES.2 

0 il72 

0,7768 

16 

UBOL 

0.8172 

0.7761 

17 

VOTEJ> 

0 8162 

0 7»»7 

18 

UMICH.l 

0 8121 

0 7767 

19 

GTESS.l 

0 8016 

0 7619 

20 

GTESSJ 

0 8000 

0-7670 

21 

NISTU 

0 7897 

0.7498 

22 

NIST.4 

0 7843 

0.7396 

23 

ASOL 

0 7773 

0.7328 

24 

GMD.3 

0.7728 

0.7327 

25 

RISO 

0.7677 

0.7268 

26 

GMD.4 

0 7682 

0.7182 

27 

GMD.l 

0.7682 

0.7182 

28 

NISTJI 

0.7443 

0 7268 

29 

GMD.2 

0.7036 

0 8871 

30 

KAMAN.l 

0.6809 

0.6431 

31 

VALEN.l 

0.6728 

0 6369 

32 

NISTJ 

0 6707 

0 6391 

33 

KAMANJl 

0 6566 

0 6209 

34 

KAMAN.2 

0.6398 

0.6069 

35 

KAMANJ 

0 5693 

0.6378 

36 

KAMAN.4 

0 6311 

0 6022 

37 

COMCOM 

0 5074 

0 4937 

38 

UMICHJ 

0 1028 

0 0606 

Table  115:  NYNEX  correlation  graph  key  for  lowers. 
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SYSTEM:  OCRSYS 


PARTICIPANT:  Harry  S.  Gierhart 

ORGANIZATION:  OCR  Systems,  Inc.,  Huntingdon  Valley,  PA 
FEATURES:  convolution  with  hand-coded  filters 

CLASSIFICATION:  MLP; the top three choices and a confidence value that
discriminates between them are calculated.


HARDWARE : 


TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE 


number  used  is  proprietary  INTERNAL 


STATUS:  on  time 


RESULTS:   -- DIGITS --       -- UPPERS --       -- LOWERS --      DATABASE
           REJ.    ERR.       REJ.    ERR.       REJ.    ERR.      TESTDATA1
           RATE    RATE       RATE    RATE       RATE    RATE
           0.00    0.0156     0.00    0.0573     0.00    0.1370
           0.10    0.0150     0.10    0.0224     0.10    0.1042
           0.20    0.0166     0.20    0.0144     0.20    0.0800
           0.30    0.0188     0.30    0.0123     0.30    0.0636
           0.40    0.0219     0.40    0.0106     0.40    0.0586
           0.50    0.0262     0.50    0.0080     0.50    0.0528

OCR  RATE  (CPS) : 

DIGITS 

UPPERS 

LOWERS 

SYS  RATE: 

CPU  RATE: 

300.00

220.00

220.00

NOTE:  Internal database is very large.

NOTE:  HYP files for upper case letters included letters classified as lower case letters. These were
scored as incorrect for the Conference, giving a zero rejection rate score of 0.0738. The score given
above for UPPERS is case insensitive.

NOTE:  Used  a beta  test  version  of  an  off-the-shelf  system  for  this  submission. 

NOTE:  Recently purchased by Adobe Systems.
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SYSTEM:  OCRSYS 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
none 
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ERROR  RATE  (%) 


OCRSYS  — DIGITS  UPPERS  LOWERS 


REJECTION  RATE  (%) 


Figure  186:  Error  rate  versus  rejection  rate  for  OCRSYS 




Figure  187:  Error  rate  per  writer  of  OCRSYS 
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OCfWV&OtGn-fOfWELATE 


SYSTEM  NUMBER 

Figure  188:  OCRSYS  - digit  correlation 


System  Number 

System Name

Correlation (all)

Correlation  (correct) 

1 

OCRSYS 

1 0000 

1 0000 

2 

REFERENCE 

0.9844 

0 9844 

3 

VOTEJM 

0 9746 

0 9681 

4 

ATTJ 

0 9653 

0 9601 

6 

AEG 

0 9652 

0 9586 

6 

IBM 

0 9641 

0 9577 

7 

ELSAGB.2 

0 9636 

0 9583 

S 

ELSAGB J 

0.9632 

0.9579 

9 

VOTEJ> 

0 9629 

0 9579 

10 

ATTJ 

0.9627 

0 9563 

ll 

ERlM.l 

0 9619 

0 9550 

12 

ERIM^ 

0 9608 

0 9542 

13 

THINKJ 

0 9591 

0 9537 

14 

ATT.4 

0.9585 

0 9520 

15 

KODAKS 

0 9584 

0 9521 

16 

REI 

0 9581 

0 9528 

17 

NYNEX 

0.9564 

0 9501 

16 

UBOL 

0 9552 

0 9492 

19 

NESTOR 

0 9552 

0 9483 

20 

KODAK J 

0.9522 

0.9458 

21 

HUGHES. 1 

0 9520 

0.9453 

22 

HUGHES.2 

0 9519 

0.9451 

23 

SYMBUS 

0 9512 

0 9456 

24 

ATT  J 

0 9506 

0 9447 

25 

NIST.4 

0.9499 

0 9435 

26 

THINK.l 

0 9492 

0 9435 

27 

ELSAGB.l 

0 9485 

0 9425 

26 

COMCOM 

0.9471 

0 9445 

29 

GTESS.l 

0.9336 

0 9277 

30 

GTESS J 

0 9320 

0 9261 

31 

NIST.l 

0 9211 

0 9156 

32 

GMD.3 

0 9181 

0 9122 

33 

MIME 

0 9138 

0 9079 

34 

GMD.l 

0 9114 

0 9060 

35 

ASOL 

0 9109 

0 9050 

36 

UPENN 

0 9094 

0 9031 

37 

NISTJJ 

0.9074 

0 9016 

38 

NISTJ 

0 9017 

0 8963 

39 

GMD.4 

0 8971 

0.8918 

40 

RISO 

0 8948 

0 6884 

41 

KAMAN.l 

0.6864 

0 8800 

42 

KAMAN.3 

0.8697 

0 8633 

43 

KAMAN.2 

0 6671 

0 6608 

44 

KAMANJ 

0 8489 

0.8431 

45 

GMDJ 

0.8457 

0 8402 

46 

VALEN.2 

0 8419 

0 8366 

47 

IFAX 

0 8298 

0 8241 

46 

VALEN.l 

0 6222 

0 8159 

49 

KAMAN.4 

0.7933 

0.7875 

Table  116:  OCRSYS  correlation  graph  key  for  digits. 
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OCmY«.UPPERCOMIELATE 


SYSTOI  NUMBER 

Figure 189: OCRSYS - upper case correlation


System  Number 

System Name

Correlation (all)

Correlation  (correct) 

1

OCRSYS

1.0000

1.0000 

2 

REFERENCE 

0 9262 

0.9262 

3 

VOTE^ 

0.9269 

0 9147 

4 

AEG 

0.9206 

0.9091 

& 

NYNEX 

0.9104 

0.8989 

fl 

ATT-l 

0 9091 

0.8983 

7 

UMICH.l 

0 9088 

0 8968 

8 

ERIM-1 

0 9080 

0 8972 

9 

NESTOR 

0.9036 

0.8917 

10 

VOTE_P 

0 9019 

0 8940 

1 1 

ATTJ 

0.9010 

0.8906 

12 

IBM 

0 8982 

0.8866 

13 

UBOL 

0 8972 

0 8861 

14 

HUGHES.l 

0.8967 

0 8848 

16 

HUGHES.2 

0 8962 

0.8829 

16 

ATTJ 

0 8943 

0.8827 

17 

ATTJ 

0.8938 

0 8832 

IS 

KODAKJ 

0.8926 

0 8820 

19 

SYMBUS 

0 8901 

0.8792 

20 

GTESS.l 

0 8868 

0.8737 

21 

GTESS-2 

0 8839 

0 8723 

22 

MIME 

0 8627 

0.8621 

23 

NIST_4 

0-8687 

0.8476 

24 

ASOL 

0.8666 

0.8441 

26 

REI 

0 8623 

0 8404 

26 

RISC 

0 8298 

0 8179 

27 

NIST.l 

0.8277 

0 8166 

28 

GMD.l 

0 8267 

0.8163 

29 

GMD.3 

0 8269 

0.8160 

30 

KAMAN.l 

0 8180 

0 8070 

31 

GMD.4 

0 8090 

0 7988 

32 

COMCOM 

0.7987 

0 7927 

33 

NISTJ 

0.7908 

0.7863 

34 

IFAX 

0 7768 

0.7642 

36 

KAMANJ 

0.7766 

0.7646 

36 

KAMAN-2 

0 7672 

0 7666 

37 

NISTJ 

0 7396 

0.7311 

38 

VALEN-l 

0 7334 

0.7216 

39 

GMD-2 

0.7287 

0.7187 

40 

KAMAN.4 

0.7047 

0.6946 

41 

KAMANJ 

0.6417 

0.6314 

42 

UMICH-2 

0.0368 

0.0217 

Table  117:  OCRSYS  correlation  graph  key  for  uppers. 
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OCmvaXOWERCOIMELATE 


SYSTEM  NUUBER 

Figure 190: OCRSYS - lower case correlation


System Number

System Name

Correlation (all)

Correlation  (correct) 

1 

OCRSYS

1.0000

1.0000

2 

VOTE^ 

0 8745 

0.8327 

3

REFERENCE

0.8630

0.8630

4 

AEG 

0 8508 

0 8045 

& 

UMICH.1 

0 84  72 

0 7941 

6 

ERIM-1 

0 8417 

0 7969 

7 

IBM 

0 8342 

0 7863 

8 

ATTJ 

0.8325 

0 7908 

9 

UBOL 

0 8320 

0.7828 

10 

ATTJ 

0 8291 

0 7893 

11 

HUGHES. 1 

0 8267 

0 7810 

12 

HUGHES.2 

0 8252 

0 7794 

u 

NYNEX 

0 8232 

0.7866 

14 

ATT.l 

0.8219 

0.7849 

15 

NESTOR 

0 8205 

0 7809 

16 

ATTJ 

0 8194 

0 7744 

17 

KODAKJ 

0.8184 

0 7811 

18 

GTESS.1 

0.8083 

0.7641 

19 

VOTEJ» 

0 8068 

0 7822 

20 

NIST.4 

0 7928 

0.7431 

21 

RISC 

0.7908 

0.7379 

22 

GTESSJ 

0 7843 

0.7483 

23 

NIST.l 

0 7809 

0 7448 

24 

GM0.3 

0 7613 

0.7256 

25 

ASOL 

0.7600 

0.7242 

26 

GMD.4 

0.7473 

0 7110 

27 

GMD.l 

0.7473 

0.7110 

28 

NISTJ 

0 727g 

0 7138 

29 

GMD  J 

0 7021 

0 6642 

30 

VALEN.1 

0 8817 

0 6418 

31 

KAMAN.l 

0 8744 

0 6381 

32 

NISTJ 

0.8722 

0 6364 

33 

KAMANJ 

0 6529 

0 6169 

34 

KAMANJ 

0 6362 

0.6022 

35 

KAMANJ 

0 5707 

0 5365 

36 

KAMAN.4 

0.5228 

0.4985 

37 

COMCOM 

0 4980 

0.4882 

38 

UMICH-2 

0 0827 

0 0451 

Table  118:  OCRSYS  correlation  graph  key  for  lowers. 
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SYSTEM:  REI 


PARTICIPANT:  David  L.  Cauthron 


ORGANIZATION:  Recognition  Equipment  Inc.  (REI), 

FEATURES:  model-based 
CLASSIFICATION:  MLP 

HARDWARE:  VAX  simulation  of  386  with  coprocessor  boards  ICR 

handprint  recognizer 


TRAINING:   DIGITS   UPPERS   LOWERS   DATABASE
            245000   100000       NA   INTERNAL

STATUS:     on time

RESULTS:   -- DIGITS --       -- UPPERS --       -- LOWERS --      DATABASE
           REJ.    ERR.       REJ.    ERR.       REJ.    ERR.      TESTDATA1
           RATE    RATE       RATE    RATE       RATE    RATE
           0.00    0.0401     0.00    0.1174
           0.14    0.0055     0.57    0.0117
           0.10    0.0088     0.40    0.0244
           0.07    0.0139     0.24    0.0379
           0.04    0.0194     0.15    0.0582

OCR RATE (CPS):   DIGITS   UPPERS   LOWERS
SYS RATE:           1.97     2.06
CPU RATE:

NOTE:  Internal  database  contains  approximately  245000  digits  and  100000  upper  case  letters. 
NOTE:  Few  details  of  system  description  provided.  Did  not  train  on  NIST  data. 
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SYSTEM:  REI 
BIBLIOGRAPHY:

The  following  references  have  been  provided  for  this  system: 
none 
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ERROR  RATE  (%) 


REI  — DIGITS  UPPERS 


REJECTION  RATE  (%) 


Figure  191:  Error  rate  versus  rejection  rate  for  REI 
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Figure  192:  Error  rate  per  writer  of  REI 
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NELnQIT.CORflELATE 


SY8TE1I  NUMBER 


Figure  193:  REI  - digit  correlation 


System  Number 

System Name

Correlation  ( all ) 

Correlation  (correct) 

1 

REI 

1 0000 

1 0000 

2 

REFERENCE 

0.9599 

0.9599 

3 

OCRSYS 

0.9581 

0 9528 

4 

VOTE^ 

0 9561 

0 9479 

5 

IBM 

0 9489 

0 9400 

6 

ATT-I 

0.9467 

0 9400 

7 

AEG 

0.9466 

0 9389 

S 

VOTE_P 

0 9460 

0.9401 

9 

ATTJ2 

0 9459 

0 9377 

10 

ELSAGB^ 

0 9455 

0 9387 

11 

ELSAGB^ 

0.9451 

0 9383 

12 

ERIM  J 

0 9439 

0 9357 

13 

ERIM.l 

0 9436 

0 9355 

14 

THINKS 

0 9434 

0 9355 

13 

KODAK-2 

0.9424 

0 9338 

16 

ATT-4 

0.9417 

0 9335 

17 

NYNEX 

0 9406 

0 9323 

18 

NESTOR 

0 9386 

0 9303 

19 

UBOL 

0 9379 

0 9307 

20 

HUGHES.! 

0 9377 

0 9282 

21 

HUGHES. 2 

0 9372 

0 9278 

22 

KODAK J 

0 9363 

0 9276 

23 

SYMBUS 

0 9357 

0 9280 

24 

THINK.l 

0 9339 

0 9263 

25 

ATT  J 

0 9337 

0 9268 

26 

NIST.4 

0 9334 

0 9253 

27 

ELSAGB.l 

0 9334 

0 9249 

28 

COMCOM 

0 9285 

0 9251 

29 

GTESS.l 

0 9188 

0 9109 

30 

GTESSJ! 

0.9169 

0 9092 

31 

NIST.l 

0.9078 

0.8999 

32 

GMD.3 

0 9046 

0 8965 

33 

UPENN 

0 8998 

0 8896 

34 

MIME 

0 8997 

0.8920 

35 

GMD.l 

0 8984 

0 8906 

36 

ASOL 

0 8972 

0 8893 

37 

NIST.2 

0 8952 

0 8870 

38 

NISTJ 

0 8887 

0 8812 

39 

GMD.4 

0 8844 

0 8766 

40 

RISO 

0.8836 

0 8745 

41 

KAMAN.l 

0 8743 

0 8655 

42 

KAMANJ 

0 8596 

0 8503 

43 

KAMAN J 

0.8564 

0 8476 

44 

K AMAN.S 

0.8389 

0 8305 

45 

GMD.2 

0 8362 

0 8280 

46 

VALEN J 

0 8332 

0 8241 

47 

IFAX 

0.8221 

0.8128 

48 

VALEN.l 

0.8122 

0 8032 

49 

KAMAN.4 

0.7832 

0 7755 

Table  119:  REI  correlation  graph  key  for  digits. 
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na-UPKRCORRELATE 


3VSTE1I  NUMBER 

Figure 194: REI - upper case correlation


System Number

System Name

Correlation  ( all ) 

Correlation  (correct) 

1 

REI 

1 0000 

1.0000 

2 

VOTEJ»l 

0 8836 

0 8715 

3 

REFERENCE 

0 8826 

0 8826 

4 

AEG 

0 8771 

0.8658 

& 

ATT-4 

0 8726 

0 8589 

6 

NYNEX 

0 8704 

0 8586 

7 

UMICH.l 

0 8694 

0 8561 

8 

ERIM-l 

0 8675 

0.8552 

9 

NESTOR 

0.8665 

0 8527 

10 

ATT-2 

0 8658 

0 8531 

11 

VOTE-P 

0 8634 

0 8556 

12 

IBM 

0 8634 

0 8490 

13 

UBOL 

0 8596 

0 8465 

M 

ATTJ 

0 8596 

0.8456 

13 

HUGHES. 1 

0 8591 

0 8447 

16 

HUGHES.2 

0.8587 

0 8433 

17 

ATTJ 

0,8571 

0 8442 

18 

KODAKJ 

0 8555 

0 8428 

19 

OCRSYS 

0 8523 

0.8404 

20 

SYMBUS 

0 8519 

0 8388 

21 

GTESS.l 

0 8442 

0 8327 

22 

GTESS  J 

0 8426 

0 8312 

23 

MIME 

0 8332 

0.8176 

24 

NIST-4 

0.8328 

0 8153 

25 

ASOL 

0 8225 

0 8092 

26 

RISC 

0 8075 

0 7876 

27 

GMD.l 

0 8037 

0.7870 

28 

GMD.3 

0 8015 

0.7851 

29 

NIST.l 

0 8013 

0 7860 

30 

KAMAN.l 

0 7991 

0 7803 

31 

GMD.4 

0 7834 

0.7688 

32 

NIST.3 

0 7694 

0.7583 

33 

COMCOM 

0 7668 

0 7612 

34 

K AMAN.3 

0 7597 

0 7399 

35 

IFAX 

0 7549 

0 7357 

36 

KAMAN-2 

0 7533 

0 7328 

37 

NISTJ 

0.7232 

0.7083 

38 

VALEN-1 

0 7169 

0 6983 

39 

GMD.2 

0.7154 

0 6975 

40 

KAMAN.4 

0 6958 

0 6750 

41 

KAMAN^ 

0 6303 

0 6121 

42 

UMICHJ 

0 0550 

0 0204 

Table  120:  REI  correlation  graph  key  for  uppers. 
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No  Data  Available 


Figure  195:  REI  - lower  case  correlation 


There was no data for this evaluation.


Table  121:  REI  correlation  graph  key  for  lowers. 
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SYSTEM:  RISO 


PARTICIPANT:  Christian  Liisberg 

ORGANIZATION:  Riso  National  Laboratories,  Roskilde,  Denmark 


PREPROCESSING:  size  normalization  to  16x16,  no  deskewing. 

The normalized image is directly input to the neural net.

FEATURES:  receptor  field  (LVT) 

CLASSIFICATION:  self-organizing geometric, ensembles of look-up table
networks used.


HARDWARE : 

33  MHz  486 

with  16  Mbyte  RAM 

TRAINING:   DIGITS   UPPERS   LOWERS   DATABASE
            210000    40000    40000   NSDB3
            1 to 2 ambiguous characters removed by hand


STATUS:  on  time 


RESULTS:   -- DIGITS --       -- UPPERS --       -- LOWERS --      DATABASE
           REJ.    ERR.       REJ.    ERR.       REJ.    ERR.      TESTDATA1
           RATE    RATE       RATE    RATE       RATE    RATE
           0.00    0.1055     0.00    0.1414     0.00    0.2172
           0.02    0.0975     0.03    0.1244     0.04    0.1979
           0.06    0.0759     0.10    0.0943     0.13    0.1580
           0.10    0.0594     0.16    0.0694     0.21    0.1273
           0.14    0.0460     0.22    0.0504     0.28    0.1013
           0.18    0.0345     0.29    0.0347     0.36    0.0790
           0.23    0.0241     0.38    0.0221     0.45    0.0578
           0.30    0.0141     0.55    0.0105     0.61    0.0288

OCR  RATE  (CPS) : 

DIGITS 

UPPERS 

LOWERS 

SYS  RATE: 

4.67 

2.00 

7 

CPU  RATE: 

6.79 

2.31 

7 
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SYSTEM:  RISC 

PARTICIPANT:  Christian Liisberg
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
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ERROR  RATE  (») 


RISO  --  DIGITS  UPPERS  LOWERS 


REJECTION  RATE  (%) 


Figure  196:  Error  rate  versus  rejection  rate  for  RISO 
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Figure  197:  Error  rate  per  writer  of  RISO 
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ia8aOiafT.COIIKLATC 


SYSTEM  NUMBER 


Figure  198:  RISO  - digit  correlation 


System  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1

RISO

1.0000

1.0000 

2 

VOTEJ< 

0 9091 

0 8916 

3 

THINK.l 

0 9060 

0.8812 

4 

ATT.4 

0 9060 

0 8838 

5 

KODAK J 

0 9017 

0 8821 

« 

ATT  J 

0 9015 

0 8839 

7 

VOTE_P 

0 901 1 

0.8889 

8 

NIST.4 

0 9008 

0 8777 

9 

SYMBUS 

0 9002 

0 8794 

10 

AEG 

0 8990 

0 6828 

1 1 

ERIMU 

0 8990 

0.8818 

12 

KODAKJ 

0 8985 

0 8785 

L3 

ERIM-2 

0.8974 

0 8812 

14 

ATTJ 

0 8959 

0 8827 

lb 

ELSAGB J 

0 8958 

0.8823 

16 

IBM 

0 8957 

0 8816 

17 

ELSAGB^ 

0 8955 

0 8820 

18 

OCRSYS 

0 8948 

0 8884 

19 

REFERENCE 

0 8945 

0 8945 

20 

UBOL 

0 8941 

0 8771 

21 

ATTJ 

0 8936 

0.8763 

22 

NESTOR 

0.8935 

0 8772 

23 

GTESS  J 

0 8930 

0 8669 

24 

GTESS.l 

0 8929 

0 8677 

25 

NIST.l 

0 8926 

0 8618 

26 

ELS  AGB.l 

0 8912 

0 8730 

27 

NISTJ 

0 8909 

0 8561 

28 

GMD.S 

0 8899 

0.8599 

29 

HUGHES. 1 

0 8890 

0 8732 

30 

MIME 

0.8682 

0 8571 

31 

NIST  J 

0 8882 

0 8526 

32 

HUGHES. 2 

0 8875 

0 8725 

33 

THINKJ2 

0 8869 

0 8759 

34 

NYNEX 

0 8867 

0 8741 

35 

ASOL 

0 8867 

0 8551 

36 

REI 

0 8836 

0 8745 

37 

KAMAN.l 

0 8828 

0.8426 

38 

GMD-1 

0 8800 

0 8526 

39 

UPENN 

0 8739 

0 8470 

40 

GMD.4 

0 8648 

0 8382 

41 

COMCOM 

0 8633 

0 8607 

42 

KAMAN.3 

0 8625 

0 8258 

43 

KAMAN.2 

0 8605 

0 8233 

44 

GMDJ 

0.8525 

0 8110 

45 

KAMAN.A 

0.8359 

0 8035 

46 

VALEN.1 

0 8068 

0.7739 

47 

IFAX 

0 8064 

0.7802 

48 

VALENJ 

0 7994 

0 7832 

49 

KAMAN.4 

0 7974 

0.7582 

Table  122:  RISO  correlation  graph  key  for  digits. 
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SYSTEM  NUMBER 


Figure 199: RISO - upper case correlation


System  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

RISO 

1.0000 

1.0000 

2 

VOTE-M 

0 8761 

0.8554 

3 

ATT.4 

0 8760 

0 8488 

4 

SYMBUS 

0.86S3 

0.8353 

5 

KODAKJ 

0 8625 

0.8359 

6 

ATTJ 

0.8619 

0 8406 

7 

VOTEJ> 

0 8614 

0 8496 

8 

AEG 

0.8587 

0.8446 

9 

REFERENCE 

0.8586 

0 8586 

10 

MIME 

0 8585 

0.8210 

11 

ERIM.l 

0 8562 

0 8390 

12 

UBOL 

0 8557 

0.8337 

13 

UMICHU 

0.8552 

0.8383 

14 

NESTOR 

0 8545 

0.8361 

IS 

ATTJ 

0 8537 

0 8321 

16 

IBM 

0.8528 

0 8331 

17 

NYNEX 

0 8503 

0 8369 

18 

ATTJ 

0 8484 

0 8296 

19 

GTESS-1 

0 8449 

0 8219 

20 

HUGHES. 1 

0 8443 

0 8263 

21 

HUGHES. 2 

0.8425 

0 8245 

22 

GTESS  J 

0 8425 

0 8204 

23 

ASOL 

0 8409 

0 8085 

24 

NIST.4 

0 8405 

0 8096 

2S 

NIST.1 

0 8338 

0 7925 

26 

OCRSYS 

0 8298 

0 8179 

27 

GMD.l 

0 8241 

0 7899 

28 

GMD-3 

0.8220 

0 7875 

29 

KAMAN.l 

0 8153 

0 7803 

30 

REI 

0 8075 

0 7876 

31 

NISTJ 

0.8062 

0,7692 

32 

GMD.4 

0 8009 

0,7697 

33 

KAMANJ 

0.7835 

0.7449 

34 

KAMAN.2 

0 7792 

0 7381 

3S 

NISTJ 

0 7765 

0.7253 

36 

GMD.2 

0 7710 

0 7165 

37 

IFAX 

0 7479 

0 7241 

38 

COMCOM 

0 7469 

0.7409 

39 

VALEN.l 

0 7320 

0.6968 

40 

KAMAN.4 

0.7272 

0 6826 

41 

KAMAN^ 

0.6464 

0.6143 

42 

UMICH  J 

0 0641 

0 0161 

Table  123:  RISO  correlation  graph  key  for  uppers. 
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svarrEMNUMBcn 

Figure  200:  RISO  - lower  case  correlation 


System Number

System Name

Correlation (all)

Correlation  (correct) 

1 

RISO

1.0000

1.0000

2 

VOTEJH 

0.8227 

0 7683 

3 

UMICH.1 

0.7968 

0 7369 

4 

ATT.4 

0.7947 

0 7391 

b 

OCRSYS 

0 7908 

0.7379 

6 

ATTJ 

0 7902 

0 7393 

7 

ATT  J 

0.7869 

0.7289 

a 

UBOL 

0 7813 

0 7269 

9 

ERIM-1 

0.7838 

0.7368 

10 

REFERENCE 

0 7828 

0 7828 

L 1 

AEG 

0 7817 

0 7360 

12 

IBM 

0.7811 

0.7272 

13 

NIST.l 

0 7784 

0 7145 

14 

KOOAKU 

0 7761 

0-7279 

lb 

VOTEJ* 

0,7732 

0 7460 

16 

ATTU 

0 7724 

0 7281 

17 

NESTOR 

0.7694 

0 7243 

la 

GTESS.l 

0 7690 

0 7131 

19 

NIST.l 

0 7678 

0 7017 

20 

NYNEX 

0 7677 

0.7268 

21 

HUGHES-1 

0,7647 

0 7173 

22

HUGHES.2 

0.7602 

0 7145 

23 

GTESS.2 

0 7638 

0 7039 

24 

ASOL 

0 7493 

0 6898 

25

GMD.3 

0 7149 

0 6917 

26 

NISTJ 

0 7418 

0 6964 

27 

GMD.l 

0 7286 

0 6766 

28 

GMD.l 

0 728S 

0 6765 

29 

GMD.2 

0.7266 

0.6607 

30 

NIST  J 

0 7193 

0.6332 

31 

KAMAN.l 

0 6821 

0 6172 

32 

VALEN.l 

0 6790 

0 6148 

33 

KAMAN-3 

0.6693 

0 6964 

34 

K AMAN^ 

0 6472 

0 6833 

35

KAMAN-S 

0 6667 

0.6168 

36 

KAMAN.l 

0 6463 

0.4889 

37 

COMCOM 

0 4627 

0 4640 

38 

UMICH  J 

0 0906 

0 0336 

Table  124:  RISO  correlation  graph  key  for  lowers. 
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SYSTEM:  SYMBUS 


PARTICIPANT:  Jerry  Fisher 

ORGANIZATION:  Symbus  Technology,  Brookline,  MA 
FEATURES:  output  of  preprocessing 
CLASSIFICATION:  cascaded  self-organizing  NNs 
HARDWARE:

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           number used is proprietary  NA  INTERNAL

STATUS:  on time, three RJX files missing

RESULTS:  -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
          REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
          RATE   RATE     RATE   RATE     RATE   RATE
          0.00   0.0471   0.00   0.0729
          0.00   0.0470   0.00   0.0727
          0.02   0.0397   0.07   0.0462
          0.04   0.0327   0.15   0.0289
          0.11   0.0194   0.28   0.0151
          0.19   0.0111

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        NA      NA      NA
CPU RATE:


NOTE: Some of the HYP files contained tildes to indicate that no classification was attempted.
At NIST, every classification in the whole file, rather than just the tildes, was inadvertently
converted to a question mark before scoring for the Conference. This gave error rates at zero
rejection of 7.0% and 12.0% for digits and uppers, respectively. The scores above reflect the
correction of this NIST error.


NOTE:  Few  if  any  details  provided  about  features  or  recognition  algorithm. 
NOTE: Internal database includes NSDB1.
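
The scoring error described in the first note can be illustrated with a short sketch. It assumes, purely for illustration, that a hypothesis file is a list of one-character classifications in which `~` marks "no classification attempted" and `?` is the rejection symbol used for scoring. Both function names are hypothetical, and this is not the conversion code actually used at NIST.

```python
def convert_tildes(hyps):
    """Intended behaviour: only '~' entries become rejections ('?')."""
    return ["?" if h == "~" else h for h in hyps]


def convert_whole_file(hyps):
    """Faulty behaviour as described in the note (assumed reading): if a
    file contains any '~', every classification in it becomes '?'."""
    return ["?"] * len(hyps) if "~" in hyps else list(hyps)


hyps = ["3", "~", "7", "1"]          # hypothetical file contents
print(convert_tildes(hyps))           # only the tilde is rejected
print(convert_whole_file(hyps))       # the whole file is rejected
```

Under the faulty conversion every character in an affected file is scored as a rejection, which changes the reported error rates for that system.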
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SYSTEM;  SYMBUS 
BIBLIOGRAPHY: 

The following references have been provided for this system:
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NUMBER  WRITERS  WITH  ERROR  o E 


SYMBUS  — DIGITS  UPPERS 


REJECTION  RATE  (%) 


Figure  201:  Error  rate  versus  rejection  rate  for  SYMBUS 


SYMBUS


Figure  202:  Error  rate  per  writer  of  SYMBUS 
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SVMBU&OtGIT.COfWELATE 


SYSTEM  NUMBER 

Figure  203:  SYMBUS  - digit  correlation 


System  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct  ) 

SYMBUS 

1 0000 

1 0000 

2 

VOTE_M

0 9611 

0.9468 

3 

AEG 

0 9536 

0 9388 

4 

REFERENCE 

0 9529 

0 9529 

5

OCRSYS 

0 9512 

0 9456 

6 

VOTE_P

0 9512 

0 9411 

7 

ERIM.l 

0 9494 

0.9350 

8

KODAK.2 

0 9494 

0.9337 

9 

ATTJ 

0 9488 

0.9356 

10 

ATT-4 

0 9486 

0 9333 

11 

ERIM  J 

0 9479 

0 9346 

12 

CLSAGB^ 

0 9467 

0.9356 

13 

attj 

0 9462 

0 9361 

14 

ELSAGB^ 

0 9462 

0 9362 

15

IBM 

0.9442 

0 9341 

16 

KODAKJ 

0 9430 

0 9273 

17 

UBOL 

0 9425 

0.9294 

18 

NIST-4 

0 9411 

0 9260 

19 

THINK.1 

0.9407 

0.9259 

20 

NESTOR 

0 9406 

0 9283 

21 

ATTJ 

0 9399 

0.9268 

22 

THINK.2 

0 9389 

0 9297 

23 

ELSAGB.l 

0 9379 

0.9240 

24 

HUGHES. 2 

0 9370 

0.9244 

25 

HUGHES-1 

0 9366 

0.9242 

26 

REI 

0 9357 

0.9280 

27 

NYNEX 

0 9348 

0.9256 

28 

GTESS.l 

0 9243 

0.9106 

29 

GTESS.2 

0 9234 

0.9093 

30 

COMCOM 

0 9197 

0.9169 

31 

NISTU 

0 9157 

0.9006 

32 

GMD.J 

0.9136 

0.8981 

33 

MIME 

0 9079 

0 *927 

34 

ASOL 

0 9076 

0 8915 

35 

NISTJ 

0 9068 

0 8902 

36 

UPENN 

0 9065 

0.8900 

37 

GMD.l 

0 9058 

0.8915 

38 

NISTJ 

0.9017 

0 8850 

39 

RISO

0 9002 

0 8794 

40 

GMD-4 

0 8904 

0.8768 

41 

KAMAN.l 

0.8873 

0 8693 

42 

KAMAN-3 

0 8698 

0.8528 

43 

KAMAN.2 

0 8685 

0.8512 

44 

GMD.2 

0 8495 

0 8320 

45 

KAMAN.S 

0 8484 

0.8327 

46 

VALEN.2 

0.8341 

0 8223 

47 

IFAX 

0.8286 

0.8134 

48 

VALEN.1 

0.8201 

0 8048 

49 

KAMAN.4 

0.7965 

0.7800 

Table 125: SYMBUS correlation graph key for digits.
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SYSTEM  NUMBER 

Figure  204:  SYMBUS  - upper  case  correlation 


System Number

System Name

Correlation (all)

Correlation (correct)

1 

SYMBUS 

1 0000 

1.0000 

2 

VOTE_M

0 9382 

0.9204 

3

ATT.4 

0,9306 

0 9081 

4 

REFERENCE 

0 9271 

0 9271 

5

AEG 

0 9268 

0 9126 

6 

ERIM.l 

0 9200 

0 9029 

7 

KODAK-1 

0.9170 

0.8941 

8

VOTE_P

0.9165

0 9049 

9 

ATT^ 

0.9158

0.8986 

10 

NYNEX 

0 9137 

0.9008 

11 

UBOL 

0 9136 

0 8942 

12 

UMICH.l 

0.9133 

0 8993 

13 

NESTOR 

0 9082 

0 8943 

14 

ATTJ 

0 9082 

0.8897 

15

IBM 

0 9071 

0.8916 

16 

HUGHES. 1 

0 9070 

0 8893 

17 

HUGHES.2 

0.9054

0 8872 

18 

ATTJ 

0.9051

0.8887 

19 

GTESSJ 

0 8977 

0 8794 

20 

GTESS-2 

0.8965

0 8781 

21 

OCRSYS 

0 8901 

0 8792 

22 

MIME 

0 8843 

0 8622 

23 

NIST.4 

0 8764 

0 8667 

24 

ASOL 

0.8761 

0 8646 

25

RISO

0.8653

0 8363 

26 

REI 

0.8519

0.8388 

27 

NIST.l 

0.8504

0.8291

28

GMD.l 

0 8460 

0 8263 

29 

GMD.3 

0.8453

0 8240 

30 

KAMAN.l 

0 8361 

0 8176 

31 

GMD.4 

0 8270 

0.8070 

32 

NISTJ 

0 8179 

0 8006 

33 

COMCOM 

0 7996 

0.7941 

34 

KAMANJl 

0 7934 

0 7741 

35

IFAX 

0.7877 

0.7706 

36 

KAMAN.2 

0 7862 

0,7669 

37 

NISTJJ 

0.7688 

0 ' 63 

38 

GMD.2 

0.7579

0.7i44 

39 

VALEN.l 

0.7509

0.7301

40 

KAMAN.4 

0 7267 

0.7066 

41 

KAMAN.i 

0 6642 

0.6381 

42 

UMICHJ! 

0 0424 

0.0204 

Table  126:  SYMBUS  correlation  graph  key  for  uppers. 
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No  Data  Available 


Figure  205:  SYMBUS  - lower  case  correlation 


There was no data for this evaluation.


Table  127:  SYMBUS  correlation  graph  key  for  lowers. 
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SYSTEM:  THINK. 1 


PARTICIPANT:  Stephen Smith

ORGANIZATION:  Thinking Machines Corporation, Cambridge, MA

PREPROCESSING:  size normalization

FEATURES:  template, model, including arcs extracted from a 32x32
image after normalization.

CLASSIFICATION:  distance maps, modified nearest neighbor,
modified Hamming distance used where each pixel is
represented by its distance to the nearest
matching pixel.

HARDWARE:  32,768 processor CM2 with SUN front end

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           all     NA      NA      NSDB3

STATUS:  on time

RESULTS:  -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
          REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
          RATE   RATE     RATE   RATE     RATE   RATE
          0.00   0.0489
          0.10   0.0152
          0.20   0.0059
          0.30   0.0027
          0.40   0.0014
          0.50   0.0006

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.67
CPU RATE:
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SYSTEM:  THINK.l 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
none 
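
The CLASSIFICATION entry in the THINK_1 summary describes a Hamming distance modified so that each pixel is scored by its distance to the nearest matching pixel. The sketch below is one possible reading of that description, not the participant's implementation: it assumes binary images stored as lists of rows, a city-block metric, a symmetric sum over both images, and at least one "on" pixel per image. All function names are hypothetical.

```python
def distance_map(img):
    """City-block distance from every pixel to the nearest 'on' pixel
    (brute force; assumes the image has at least one 'on' pixel)."""
    on = [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v]
    return [[min(abs(r - pr) + abs(c - pc) for pr, pc in on)
             for c in range(len(row))] for r, row in enumerate(img)]


def modified_hamming(a, b):
    """Sum, over the 'on' pixels of each image, of the distance to the
    nearest 'on' pixel of the other image (0 when the images coincide)."""
    da, db = distance_map(a), distance_map(b)
    cost = 0
    for r, row in enumerate(a):
        for c in range(len(row)):
            if a[r][c]:
                cost += db[r][c]
            if b[r][c]:
                cost += da[r][c]
    return cost


def nearest_neighbor(image, prototypes):
    """prototypes: list of (label, image); returns the closest label."""
    return min(prototypes, key=lambda p: modified_hamming(image, p[1]))[0]


n = 5
vertical = [[1 if c == 2 else 0 for c in range(n)] for r in range(n)]
horizontal = [[1 if r == 2 else 0 for c in range(n)] for r in range(n)]
shifted = [[1 if c == 3 else 0 for c in range(n)] for r in range(n)]

print(modified_hamming(shifted, vertical))    # stroke shifted by one pixel
print(modified_hamming(shifted, horizontal))  # stroke rotated
print(nearest_neighbor(shifted, [("1", vertical), ("-", horizontal)]))
```

In this toy case a plain Hamming distance would count more differing pixels against the vertical prototype than against the horizontal one, whereas the distance-map version penalizes the one-pixel shift only mildly and selects the vertical prototype.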
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NUMBER  WRITERS  WITH  ERROR  <3  E 


THINK_1 -- DIGITS


REJECTION RATE (%)


Figure  206:  Error  rate  versus  rejection  rate  for  THINK_1 


THINK_1


Figure 207: Error rate per writer of THINK_1
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SYSTEM  NUMSCR 

Figure 208: THINK_1 - digit correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1

THINK_1

1 0000 

1 0000 

2 

VOTE_M

0 9603 

0.9452 

3

REFERENCE

0.9511

0 9511 

4 

ATTJ 

0 9500 

0 9369 

5

VOTE_P

0 9499 

0 9394 

6

AEG 

0 9496 

0 9356 

7 

OCRSYS 

0 9492 

0 9435 

8

ELSAGB^ 

0 9486 

0.9356 

9 

ATT.4 

0 9485 

0 9322 

10 

ELSAGB J 

0.9482 

0 9352 

11 

ATTJ 

0 9474 

0 9341 

12 

KODAK J 

0 9466 

0 9312 

13 

NIST.4 

0 9454 

0 9265 

14 

ERIM.1 

0 9453 

0.9316 

15

UBOL 

0 9445 

0 9292 

16 

IBM 

0 9440 

0 9328 

17 

ERIM  J 

0 9429 

0 9306 

18 

KODAKJ 

0 9409 

0 9255 

19 

SYMBUS 

0 9407 

0 9259 

20 

THINK-2 

0 9401 

0 9293 

21 

ELSAGB.l 

0 9386 

0 9228 

22 

NESTOR 

0 9384 

0 9261 

23 

ATTJ 

0 9364 

0 9240 

24 

NYNEX 

0 9350 

0 9247 

25 

HUGHES-1 

0.9341 

0 9220 

26 

REI 

0 9339 

0 9263 

27 

HUGHES-2 

0.9336 

0 9217 

28 

NIST.l 

0.9277 

0 9048 

29 

GTESS-1 

0 9273 

0 9107 

30 

GTESS-2 

0 9268 

0 9098 

31 

COMCOM 

0 9171 

0.9145 

32 

GMD.3 

0 9167 

0 8979 

33 

MIME 

0 9150 

0 8962 

34 

ASOL 

0 9115 

0 8922 

35 

NIST-2 

0 9097 

0 8900 

36 

GMD-1 

0 9093 

0 8916 

37 

RISO 

0 9080 

0 8812 

38 

UPENN 

0.9047

0.8873 

39 

NISTJ 

0 9037 

0 8845 

40 

GMD-4 

0.8944 

0.8771 

41 

KAMAN.l 

0 8887 

0.8687 

42 

KAMAN.3 

0 8710 

0 8525 

43 

KAMAN-2 

0 8678 

0 8494 

44 

GMD-2 

0 8502 

0.8312 

45 

KAMAN.i 

0 8474 

0.8307 

46 

VALEN-2 

0.8308 

0.8201 

47 

IFAX 

0 8273 

0.8114 

48 

VALEN-1 

0.8178 

0.8015 

49 

KAMAN-4 

0 7962 

0 7780 

Table  128:  THINK_1  correlation  graph  key  for  digits. 
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No  Data  Available 


Figure 209: THINK_1 - upper case correlation


There was no data for this evaluation.


Table 129: THINK_1 correlation graph key for uppers.
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No  Data  Available 


Figure 210: THINK_1 - lower case correlation


There was no data for this evaluation.


Table  130:  THINK_1  correlation  graph  key  for  lowers. 
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SYSTEM:  UBOL 


PARTICIPANT:  Dr.  Zsolt  M.  Kovacs-V. 

ORGANIZATION:  University  of  Bologna,  Bologna,  Italy 

PREPROCESSING:  noise  removal,  slant  normalization,  thinning, 

and  size  normalization  to  32x32.  Then  a distance 
transform  is  performed  on  the  background  and  a further 
reduction  is  performed  to  8x8.  This  provides 
a 64-dimensional  feature  vector. 


FEATURES:  rule-based  distance  transform 
CLASSIFICATION:  KNN  with  novel  metric 


HARDWARE:  simulation  of  CM2  with  64K  processors  on  SPARC 


TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           72000   all     all     NSDB3

STATUS:  on time

RESULTS:  -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
          REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
          RATE   RATE     RATE   RATE     RATE   RATE
          0.00   0.0435   0.00   0.0624   0.00   0.1548
          0.04   0.0271   0.03   0.0506   0.04   0.1365
          0.06   0.0215   0.06   0.0390   0.11   0.1107
          0.07   0.0184   0.09   0.0334   0.17   0.0909
          0.09   0.0148   0.11   0.0282   0.22   0.0745
          0.11   0.0122   0.15   0.0221   0.25   0.0655
          0.13   0.0108   0.18   0.0197   0.28   0.0564
          0.15   0.0096   0.20   0.0171   0.33   0.0436
          0.17   0.0086   0.25   0.0130   0.37   0.0379
          0.19   0.0079   0.31   0.0105   0.42   0.0287

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.06    0.09    0.04
CPU RATE:        0.08    0.10    0.05
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SYSTEM:  UBOL 
BIBLIOGRAPHY:

The following references have been provided for this system:
[43][44][45][46][47] [12][6][48][49]
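
The PREPROCESSING entry in the UBOL summary ends with a distance transform of the background of the 32x32 image and a reduction to 8x8, giving a 64-dimensional feature vector. The following is a minimal sketch of that last stage only, under stated assumptions: the summary does not give the metric or the reduction rule (it calls the transform "rule-based"), so a city-block distance and 4x4 block averaging are used here as stand-ins, and all names are hypothetical. Noise removal, slant normalization, thinning and size normalization are assumed to have been done already.

```python
def background_distance(img):
    """Two-pass city-block distance transform: each background pixel gets
    its distance to the nearest stroke pixel; stroke pixels get 0."""
    h, w = len(img), len(img[0])
    big = h + w  # larger than any possible city-block distance
    d = [[0 if img[r][c] else big for c in range(w)] for r in range(h)]
    for r in range(h):                      # forward pass
        for c in range(w):
            if r > 0:
                d[r][c] = min(d[r][c], d[r - 1][c] + 1)
            if c > 0:
                d[r][c] = min(d[r][c], d[r][c - 1] + 1)
    for r in range(h - 1, -1, -1):          # backward pass
        for c in range(w - 1, -1, -1):
            if r < h - 1:
                d[r][c] = min(d[r][c], d[r + 1][c] + 1)
            if c < w - 1:
                d[r][c] = min(d[r][c], d[r][c + 1] + 1)
    return d


def reduce_blocks(d, out=8):
    """Average the map over out x out equal blocks (assumed reduction)."""
    bh, bw = len(d) // out, len(d[0]) // out
    feats = []
    for br in range(out):
        for bc in range(out):
            block = [d[r][c]
                     for r in range(br * bh, (br + 1) * bh)
                     for c in range(bc * bw, (bc + 1) * bw)]
            feats.append(sum(block) / len(block))
    return feats


# Hypothetical 32x32 input: a single vertical stroke in column 16.
image = [[1 if c == 16 else 0 for c in range(32)] for r in range(32)]
features = reduce_blocks(background_distance(image))
print(len(features))   # 64 features
```

The resulting vector varies smoothly with small displacements of the stroke, which is one reason a distance-transformed image can be a more forgiving input to a nearest-neighbor classifier than the raw bitmap.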
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NUMBER  WRITERS  WITH  ERROR  o E 


UBOL  --  DIGITS  UPPERS  LOWERS 


REJECTION  RATE  (%) 


Figure  211:  Error  rate  versus  rejection  rate  for  UBOL 


UBOL 


Figure  212:  Error  rate  per  writer  of  UBOL 
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UaOLJMQIT.CCMRELATE 


SYSTEM NUMBER


Figure  213:  UBOL  - digit  correlation 


System  Number 

System  Name 

Correlation  ( all ) 

Correlation  (correct) 

1 

UBOL 

1 0000 

1 0000 

2 

VOTE_M

0 9644 

0.9496 

3 

AEG 

0 9632 

0 9446 

4 

ELSAGBJS 

0 9666 

0 9420 

5

REFERENCE 

0.9666 

0 9666 

6 

ELSAGB^ 

0 9661 

0 9416 

7 

OCRSYS 

0 9662 

0 9492 

8 

ERIM.l 

0 9636 

0.9382 

9 

VOTE_P 

0 9636 

0 9429 

10 

ATTa 

0 9611 

0.9403 

11

NIST.4 

0 9606 

0 9316 

12 

ATT.4 

0.9488 

0 9348 

13 

ERIMJ 

0.9487

0 9369 

14 

ATTJ 

0.9486 

0 9368 

15

KODAK J 

0.9486 

0.9349 

16 

IBM 

0 9477 

0.9374 

17 

ELSAGB_1

0.9466 

0.9287 

18 

THINK.l 

0.9446 

0 9292 

19 

SYMBUS 

0 9426 

0 9294 

20 

KODAKU 

0.9426 

0 9290 

21 

THINK.2 

0 9423 

0.9328 

22 

ATT  J 

0 9417 

0 9289 

23 

NESTOR 

0.9416 

0.9299 

24 

HUGHES. 1 

0 9401 

0 9276 

25

HUGHES_2 

0 9396 

0 9272 

26 

REI

0.9379 

0 9307 

27

NYNEX 

0 9366 

0.9282 

28 

GTESS.I 

0.9276 

0.9129 

29 

GTESS J 

0 9266 

0 9116 

30 

COMCOM 

0 9224 

0 9198 

31 

NIST.1 

0.9170 

0 9018 

32 

GMD.3 

0 9166 

0 8999 

33 

GMD.l 

0 9091 

0 8936 

34 

MIME 

0 9081 

0 8941 

35

ASOL 

0 9067 

0 8918 

36 

NIST.2 

0.9069 

0 8902 

37 

UPENN 

0 9041 

0 8896 

38 

NIST  J 

0.9016 

0 8867 

39 

GMD.4 

0 8944 

0 8792 

40 

RISO

0 8941 

0.8771 

41 

KAMAN.l 

0 8867 

0.8691 

42 

KAMAN.3 

0.8720 

0 8642 

43 

KAMAN.2 

0.8698 

0 8619 

44 

KAMAN.A 

0.8467 

0.8321 

45

GMD.2 

0 8468 

0 8307 

46 

VALENJ 

0.8366 

0.8261 

47 

IFAX 

0 8293 

0.8143 

48 

VALEN.1 

0.8177 

0 8031 

49 

KAMAN.4 

0 7990 

0 7809 

Table  131:  UBOL  correlation  graph  key  for  digits. 
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U80L.UP9ER.C0mCl-ATE 


SYSTEM NUMBER

Figure 214: UBOL - upper case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1

UBOL

1.0000

1.0000

2 

VOTE_M

0 9481 

0.9299 

3 

AEG 

0.9433 

0 9246 

4 

REFERENCE 

0.9376 

0.9376 

5

ATT.4 

0.9313 

0 9132 

6 

ERIM.l 

0.9290 

0.9116 

7 

UMICH-1 

0.9231 

0.9080 

8

NYNEX 

0 9223 

0 9091 

9 

VOTE_P

0 9213 

0.9104 

10 

ATTJ 

0 9210 

0.9049 

11

NESTOR 

0.9194 

0 9030 

12 

HUGHES. 1 

0 9188 

0 8997 

13 

ATTJ 

0.9186 

0.8998 

14 

ATT  J 

0.9169 

0.8989 

15

KODAKa 

0.9168 

0 8977 

16 

HUGHES-2 

0 9168 

0 8973 

17 

SYMBUS 

0 9136 

0 8942 

18 

IBM 

0.9126 

0.8979 

19 

GTESS.l 

0 9044 

0 8863 

20 

GTESSa 

0 9031 

0.8849 

21 

OCRSYS 

0 8972 

0 8861 

22 

NIST-4 

0 8941 

0 8692 

23 

MIME 

0.8867 

0.8681 

24 

ASOL 

0 8797 

0 8398 

25

REI 

0 8396 

0.8463 

26 

NIST-1 

0 8337 

0 8346 

27 

RISO 

0.8337 

0.8337 

28 

GMD-I 

0 8310 

0 8324 

29 

GMD.3 

0 8496 

0 8308 

30 

KAMAN.l 

0 8393 

0.8213 

31 

GMD.4 

0.8323 

0 8144 

32 

NISTJ 

0 8172 

0 8028 

33 

COMCOM 

0.8060 

0.7999 

34 

KAMAN.3 

0 7938 

0-7774 

35

IFAX 

0 7943 

0 7762 

36 

KAMAN.2 

0 7870 

0 7684 

37 

NISTJ 

0.7633 

0 7463 

38 

VALEN.l 

0-7303 

0 7313 

39 

GMD-2 

0 7488 

0 7311 

40 

KAMAN.4 

0.7262 

0-7070 

41 

KAMAN.S 

0.6331 

0.6386 

42 

UMICHJ 

0.0448 

0.0211 

Table  132:  UBOL  correlation  graph  key  for  uppers. 
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UaOLLOWERCORflELATE 


SYSTEM  NUMBER 

Figure  215:  UBOL  - lower  case  correlation 


System  Number 

System Name

Correlation (all)

Correlation (correct)

1

UBOL

1.0000 

1.0000 

2 

VOTE_M

0 8784 

0 8238 

AEG 

0.8663 

0 8026 

4 

ERIM.1 

0 *471 

0 7914 

5

REFERENCE 

0.8462 

0 8462 

6 

UMICHU 

0 8373 

0.7800 

7 

OCRSYS 

0.8320 

0.7828 

8 

ATTJ 

0 8288 

0.7796 

9 

HUGHES. 1 

0 8273 

0 7732 

10 

KODAKU 

0.8267 

0.7778 

11

HUGHES.2 

0.8267 

0 7716 

12 

ATTJ 

0.8248 

0.7704 

13 

ATTJ 

0 8188 

0-7763 

14 

IBM 

0.8184

0.7706 

15

ATT.4 

0 8173 

0 7740 

16 

NYNEX 

0 8172 

0 7761 

17 

NIST.4 

0.8146 

0 7449 

18

NESTOR 

0 8137 

0 7701 

19 

VOTE_P

0 8097 

0.7793 

20 

GTESS-1 

0.7966 

0 7604 

21 

NIST.1 

0.7883 

0.7409 

22 

GTESS  J 

0.786L 

0-7410 

23 

RISO 

0,7843 

0.7269 

24 

GMD.3 

0 7809 

0 7283 

25

GMD.4 

0 7644 

0 7122 

26 

GMD.l 

0.7644 

0 7122 

27

ASOL 

0 7634 

0.7189 

28 

NISTJ 

0 7306 

0 7106 

29 

GMD.2 

0 6998 

0 6668 

30 

VALEN.l 

0 6848 

0 6347 

31 

KAMAN.l 

0.6802 

0 6338 

32 

NISTJ 

0-6684 

0 6291 

33 

KAMANJ 

0 6662 

0.6116 

34 

KAMAN-2 

0 6417 

0 6992 

35 

KAMAN.i 

0.6667 

0 6284 

36 

KAMAN.4 

0 6372 

0 6010 

37 

COMCOM 

0.4919 

0.4798

38 

UMICHJ 

0 0966 

0.0467 

Table  133:  UBOL  correlation  graph  key  for  lowers. 
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SYSTEM:  UPENN 


PARTICIPANT:  Thomas Fontaine


ORGANIZATION:  University  of  Pennsylvania,  Philadelphia,  PA 
FEATURES:  local  receptor  fields 

CLASSIFICATION:  Spatio-temporal  connectionist  model.  Learning 

using  a gradient-based  technique.  Shift  invariance 
is  achieved  along  temporalized  directions. 


HARDWARE:  IBM  RS/6000 

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           5400                    USPS

STATUS:  on time

RESULTS:  -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
          REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
          RATE   RATE     RATE   RATE     RATE   RATE
          0.00   0.0908
          0.10   0.0517
          0.20   0.0277
          0.30   0.0169
          0.40   0.0122
          0.50   0.0102

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.50    NA      NA
CPU RATE:
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SYSTEM:  UPENN 
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
[50][51][37][56][57][52][5][53]  [54][55] 
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NUMBER  WRITERS  WITH  ERROR 


UPENN  — DIGITS 


Figure  216:  Error  rate  versus  rejection  rate  for  UPENN 


UPENN 


Figure  217:  Error  rate  per  writer  of  UPENN 
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UPEN»U)iafT.CO<(nELATE 


SYSTEM  NUMBER 

Figure  218:  UPENN  - digit  correlation 


System Number

System Name

Correlation (all)

Correlation (correct)

1

UPENN

1 0000 

1 0000 

2 

VOTE_M

0.9183 

0 9041 

3 

AEG 

0 9124 

0.8974 

4 

ATT.4 

0 9120 

0 8942 

5

KODAKS 

0 9119 

0 8946 

6 

ERIMJ 

0 9110 

0 8962 

7 

VOTE_P

0 9096 

0 8993 

8

OCRSYS 

0.9094 

0 9031 

9 

ERIM_2

0 9094 

0 8943 

10 

REFERENCE 

0 9092 

0 9092 

11

ATT  J 

0 9091 

0 8933 

12 

ELSAGB J 

0.9086 

0 8938 

13 

ELSAGB.2 

0 9080 

0 8933 

14 

ATTJ 

0,9079 

0 8961 

15

KODAKJ 

0.9070 

0 8897 

16 

IBM 

0.9069 

0 8943 

17 

SYMBUS 

0 9063 

0 8900 

18 

NIST.4 

0 9031 

0 8873 

19 

THINK.l 

0 9047 

0 8873 

20 

HUGHES. 1 

0 9043 

0 8877 

21 

THINK.2 

0 9041 

0 8913 

22 

UBOL 

0 9041 

0 8896 

23 

ELSAGB J 

0 9033 

0 8861 

24 

HUGHES. 2 

0 9031 

0 8874 

25

NESTOR 

0 9013 

0 8884 

26 

NYNEX 

0 9002 

0.8880 

27 

REI 

0 8998 

0 8896 

28 

ATT  J 

0.8998 

0 8867 

29 

GTESS.l 

0 8943 

0 8734 

30 

GTESS  J 

0 8941 

0 8748 

31 

NIST.l 

0 8843 

0.8634 

32 

COMCOM 

0 8819 

0 *782 

33 

GMD.3 

0,8798 

0 8620 

34 

MIME 

0 8797 

0 8399 

35

NISTJ 

0 8797 

0.8376 

36 

ASOL 

0 8773 

0.8377 

37 

NISTJ 

0 8763 

0.8337 

38 

RISO 

0.8739 

0.8470

39 

GMD.l 

0.8727 

0.8338 

40 

KAMAN.l 

0.8640 

0.8387 

41 

GMD.4 

0.8394 

0.8423 

42 

KAMANJ 

0 8466 

0 8232 

43 

KAMAN.2 

0.8423 

0.8203 

44 

GMD.2 

0.8238 

0 8031 

45

KAMAN.5 

0.8211 

0.8022 

46 

IFAX 

0 8078 

0 7862 

47 

VALEN.2 

0 8076 

0.7929 

48 

VALEN.1 

0.7936 

0 7739 

49 

KAMAN.4 

0 7740 

0 7317 

Table  134:  UPENN  correlation  graph  key  for  digits. 
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No  Data  Available 


Figure  219:  UPENN  - upper  case  correlation 

There was no data for this evaluation.

Table  135:  UPENN  correlation  graph  key  for  uppers. 
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No  Data  Available 


Figure  220:  UPENN  - lower  case  correlation 


There was no data for this evaluation.


Table  136:  UPENN  correlation  graph  key  for  lowers. 
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SYSTEM:  VALEN.l 


PARTICIPANT:  Enrique  Vidal 

ORGANIZATION:  Universidad  Politecnica  de  Valencia,  Valencia,  Spain 
FEATURES:  line  fit  features 
CLASSIFICATION:  KNN  or  NN  with  BP 
HARDWARE:  model  380  HP-9000 

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE

STATUS:  on time

RESULTS:  -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
          REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
          RATE   RATE     RATE   RATE     RATE   RATE
          0.00   0.1795   0.00   0.2418   0.00   0.3160
          0.10   0.1358   0.10   0.2023   0.10   0.2813
          0.20   0.0971   0.20   0.1633   0.20   0.2460
          0.30   0.0647   0.30   0.1331   0.30   0.2096
          0.40   0.0422   0.40   0.1048   0.40   0.1786
          0.50   0.0275   0.50   0.0799   0.50   0.1468

OCR  RATE  (CPS) 

: DIGITS 

UPPERS 

LOWERS 

SYS  RATE: 

5.15 

3. 

14 

3.14 

CPU  RATE: 

18.18 

5.1 

58 

5.58 
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SYSTEM:  VALEN.l 

The  following  references  have  been  provided  for  this  system: 

[58]
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NUMBER  WRITERS  WITH  ERROR 


VALEN_1 -- DIGITS  UPPERS  LOWERS


Figure 221: Error rate versus rejection rate for VALEN_1


VALEN_1


Figure 222: Error rate per writer of VALEN_1
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VALEN.I  JMQrT.OMRELATE 


SYSTEM NUMBER

Figure 223: VALEN_1 - digit correlation


System Number

System Name

Correlation (all)

Correlation (correct)

VALEN_1

1.0000 

1 0000 

2 

VOTE_M

0 8281 

0 8162 

3

NESTOR 

0 8261 

0.8068 

4 

AEG 

0 8229 

0 8093 

5

ATTJ 

0 8228 

0.8091 

6

OCRSYS 

0.8222 

0.8169 

7 

ERIM.2 

0 8218 

0.8077 

8 

ERIM_1

0 8217 

0.8073 

9 

VOTE_P

0 8210 

0 8118 

10 

IBM 

0 8209 

0 8086 

11 

KODAKJ 

0 8207 

0.8061 

12 

REFERENCE 

0 8206 

0.8206 

13 

ELSAGB J 

0 8204 

0 8082 

14 

ELSAGB.2 

0.8201 

0.8079 

15

SYMBUS 

0.8201 

0.8048 

16

NIST.4 

0 8192 

0.8014 

17 

ATT.4 

0.8191 

0 8062 

18 

THINK_1

0 8178 

0 8016 

19 

ELSAGB.1 

0 8178 

0 8003 

20 

UBOL 

0 8177 

0.8031 

21 

KODAKJ 

0.8172

0 8020 

22 

NYNEX 

0 8170 

0 8042 

23 

ATTa 

0 8161 

0 8071 

24 

HUGHES.2 

0 8167 

0 8010 

25

ATT  J 

0 8166 

0.8026 

26 

HUGHES_1

0.8144 

0 8003 

27 

THINK.2 

0 8138 

0 8033 

28 

REI 

0 8122 

0.8032 

29 

GMD_3

0.8069 

0.7839 

30 

RISO 

0 8068 

0.7739 

31 

GTESSU 

0 8068 

0 7899 

32 

GTESSJ 

0 8040 

0.7884 

33 

ASOL 

0 8028 

0 7*02 

34 

NIST.l 

0.8019 

0.7826 

35 

GMD.l 

0 8002 

0.7782 

36 

MIME 

0 7997 

0.7789 

37 

KAMAN.l 

0 7973 

0.7678 

38 

NISTJ 

0.7972 

0.7766 

39 

COMCOM 

0.7946 

0.7914 

40 

UPENN 

0 7938 

0.7739 

41 

NISTJ 

0.7929 

0.7729 

42 

KAMAN.^ 

0 7926 

0.7684 

43 

KAMANJ 

0 7921 

0.7667 

44 

GMD.4 

0 7888 

0.7662 

45

KAMANJ 

0.7828 

0.7440 

46 

GMD.2 

0 7669 

0.7377 

47 

IFAX 

0.7479 

0.7196 

48 

VALEN-2 

0.7348 

0 7179 

49 

KAMAN.4 

0 7310 

0.6964 

Table 137: VALEN_1 correlation graph key for digits.




VALEN_1.UPPER.CORRELATE


SYSTEM NUMBER

Figure 224: VALEN_1 - upper case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1

VALEN_1

1 0000 

1 0000 

2

VOTE_M

0.7664 

0.7613 

3 

UMICHJ 

0 7606 

0 7413 

4

ATT.4 

0.7696 

0 7418 

5

REFERENCE 

0 7683 

0.7683 

6 

IBM 

0.7674 

0.7378 

7 

AEG 

0 7673 

0.7460 

8

NESTOR 

0 7666 

0.7386 

9 

VOTE_P

0.7613 

0 7421 

10 

ERIMJ 

0 7613 

0.7372 

11 

SYMBUS 

0.7609 

0.7301 

12

UBOL 

O.TSOJ 

0.7316 

13 

HUGHES.l 

0.7496 

0.7309 

14 

NYNEX 

0.7496 

0.7376 

15

ATTJ 

0.7489 

0.7308 

16 

ATTJ 

0.7487 

0.7342 

17 

KODAKJ 

0 7484 

0.7303 

18

HUGHES.l 

0.7479 

0.7393 

19 

ATTJ 

0.7404 

0.7363 

20

GTESS.I 

0.7367 

0.7196 

21

GTESSJ 

0.7360 

0 7187 

22

MIME 

0.7349 

0.7U3 

23

OCRSYS 

0.7334 

0.7218 

24

NIST.4 

0.7339 

0.7076 

25

RISO 

0.7330 

0 6968 

26

ASOL 

0 7249 

0.7043 

27

KAMAN.l 

0 7219 

0 6901 

28

REI 

0.7169 

0.6983 

29

GMD.l 

0.7163 

0.6886 

30 

GMD.3 

0.7117 

0.6866 

31 

NISTa 

0 7111 

0.6860 

32

KAMAN.3 

0.6998 

0 6638 

33 

KAMANa 

0 6998 

0.6688 

34 

GMD.4 

0 6966 

0.6716 

35

NISTJ 

0 6918 

0.6691 

36 

IFAX 

0 6660 

0.6406 

37 

COMCOM 

0.6698 

0.6640 

38 

NISTa 

0.6681 

0.6398 

39 

GMOa 

0 6640 

0.6306 

40 

KAMAN.4 

0.6619 

0.6074 

41 

KAMANa 

0.6987 

0.6696 

42

UMICHa 

0 0987 

0.0134 

Table 138: VALEN_1 correlation graph key for uppers.
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VAl£Nj.LOWER.CORfCLATE 


SYSTEM  NUMBER 


Figure  225:  VALEN_1  - lower  case  correlation 


System Number

System Name

Correlation (all)

Correlation (correct)

1

VALEN_1

1 0000 

1.0000 

2

VOTE_M

0 7122 

0 6693 

3 

UMICH-1 

0 6967 

0 6426 

4 

AEG 

0 6886 

0 6446 

5

ATT  J 

0 6861 

0.6369 

6 

ERIM.l 

0.6869 

0.6423 

7 

ATT.4 

0 6861 

0 6419 

8 

UBOL 

0 6848 

0.6347 

9 

REFERENCE 

0.6840 

0.6840 

10 

IBM 

0 6837 

0 6368 

11

OCRSYS 

0 6817 

0.6418 

12 

NESTOR 

0 8812 

0 6363 

13

KODAKU 

0 6798 

0 6371 

14

RISO 

0.6790 

0.6148 

15

HUGHES.2 

0 6769 

0.6308 

16 

HUGHES. 1 

0 6767 

0.6309 

17 

ATT.2 

0 8747 

0.6376 

18 

NYNEX 

0 6728 

0.6369 

19 

ATTJ 

0 6718 

0.6346 

20 

VOTE_P

0 6701 

0.6461 

21 

NIST.4 

0.6611 

0.6096 

22 

GTESS.l 

0 6687 

0 6168 

23 

NIST.l 

0 6672 

0.6127 

24 

GTESS  J 

0.6660 

0 6117 

25

GMD.3 

0 6642 

0.6062 

26 

ASOL 

0 6460 

0.6987 

27 

GMD.4 

0 6428 

0.6946 

28 

GMD.l 

0 6428 

0.6946 

29 

NISTJ 

0.6208 

0.6924 

30 

GMD.2 

0.6081 

0.6686 

31 

KAMAN.l 

0.6067 

0.6468 

32 

KAMAN.3 

0.6921 

0.6312 

33 

NIST.2 

0 6868 

0.6372 

34 

KAMAN.2 

0.6794 

0,6218 

35

KAMAN.i 

0 6263 

0.4666 

36 

KAMAN.4 

0.4939 

0 4406 

37 

COMCOM 

0.4204 

0.4100 

38 

UMICH.2 

0.1189 

0.0321 

Table 139: VALEN_1 correlation graph key for lowers.
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F System  Summaries  For  Late  Submitted  Results 


Jon  Geist,  Jonathan  J.  Hull,  Stanley  Janet,  R.  Allen  Wilkinson,  and  Charles  L.  Wilson 

This  appendix  contains  summaries  for  most  systems  whose  HYP  files  were  received  late. 
Some  results  that  were  received  late  were  not  included  because  they  would  not  add  anything 
to  the  report  even  though  in  some  cases  the  results  are  interesting.  In  such  cases  the  results 
are  mentioned  in  the  body  of  the  report.  The  summary  format  is  exactly  the  same  as  that 
used  for  the  summaries  in  Appendix  E. 
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SYSTEM:  NIST_4 


PARTICIPANT:  Patrick  J.  Grother 
ORGANIZATION:  NIST,  Gaithersburg,  MD 

PREPROCESSING:  Size (both height and width), Slant, Stroke Width, Normalization.
Subtraction from binary image of mean of training images.

FEATURES:  Projection onto principal components of training set.

48 leading elements of "KL" transform. Digits

96 leading elements of "KL" transform. Uppers

96 leading elements of "KL" transform. Lowers

CLASSIFICATION:  PNN: Gaussian distance weighted voting among all prototypes.

Equivalent to KNN algorithm of NIST_1.

HARDWARE:  Sparc  2 running  optimized  C code. 

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE 

~56000  ~11000  ~11000  NSDB3

500  500  500  WRITERS 

STATUS:  submitted  after  Conference 


RESULTS:  -- DIGITS --    -- UPPERS --    -- LOWERS --    DATABASE
          REJ.   ERR.     REJ.   ERR.     REJ.   ERR.     TESTDATA1
          RATE   RATE     RATE   RATE     RATE   RATE
          0.00   0.0497   0.00   0.1037   0.00   0.2001
          0.10   0.0105   0.10   0.0614   0.10   0.1570
          0.20   0.0064   0.20   0.0346   0.20   0.1199
          0.30   0.0035   0.30   0.0214   0.30   0.0889
          0.40   0.0021   0.40   0.0141   0.40   0.0610
          0.50   0.0014   0.50   0.0092   0.50   0.0420

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS

OCR  RATE: 
CPU  RATE: 
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SYSTEM:  NIST.4 

The  following  references  have  been  provided  for  this  system: 

[42] 

COMMENTS:  NIST.4 

See  Cross  Validation  Section  on  Inadequacies  of  NIST  Special  Database  3 for  the  classification  of 
NIST  Test  Data  1. 

Very slow classification. No exemplar pruning or aggregation. Does not suffer from the "minority"
problems of perceptrons (e.g. crossed sevens).

Size normalization enforces a 32 pixel height and a 24 pixel width, and does not preserve aspect
ratio. Dilation / erosion used to normalize stroke widths. Significant recognition gains over NIST_1.
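
The CLASSIFICATION entry for NIST_4 describes a PNN as Gaussian distance weighted voting among all prototypes. The sketch below illustrates that voting rule only, under stated assumptions: feature vectors are taken to be already projected onto the leading "KL" components, the kernel width `sigma` and the normalized confidence value are illustrative choices not given in the summary, and the function name and toy data are hypothetical.

```python
import math


def pnn_classify(x, prototypes, sigma=1.0):
    """Every prototype votes for its class with weight
    exp(-||x - p||^2 / (2 sigma^2)); the class with the largest total wins.
    Returns (label, share of the total vote held by the winner)."""
    votes = {}
    for label, p in prototypes:
        d2 = sum((a - b) ** 2 for a, b in zip(x, p))
        weight = math.exp(-d2 / (2.0 * sigma ** 2))
        votes[label] = votes.get(label, 0.0) + weight
    best = max(votes, key=votes.get)
    total = sum(votes.values())
    return best, (votes[best] / total if total > 0 else 0.0)


# Hypothetical two-dimensional feature vectors for two classes.
prototypes = [("0", [0.0, 0.0]), ("0", [0.0, 1.0]), ("1", [5.0, 5.0])]
label, conf = pnn_classify([0.2, 0.4], prototypes)
print(label, round(conf, 4))
```

Because every stored prototype takes part in every decision, the cost of a classification grows with the size of the training set, which is consistent with the comment that classification is very slow without exemplar pruning or aggregation. A confidence value such as the vote share could be thresholded to produce rejections, though the summary does not say how NIST_4 rejected characters.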
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ERROR  RATE  (%) 


NIST  4 --  DIGITS  UPPERS  LOWERS 


REJECTION  RATE  (%) 


Figure  226:  Error  rate  versus  rejection  rate  for  NIST_4 


NIST_4


Figure  227:  Error  rate  per  writer  of  NIST_4 
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MaT.UMOrxORHQ^TE 


SYSTEM  NUMBER 

Figure  228:  NIST_4  - digit  correlation 


System Number

System Name

Correlation (all)

Correlation (correct)

1

NIST_4

1 0000 

1.0000 

2 

VOTE_M

0 9630 

0 9457 

3 

AEG 

0 9598 

0.9397 

4 

VOTE_P

0 9524 

0 9403 

5

ELSAGB J 

0 9523 

0 9369 

6 

ELSAGB^ 

0 9517 

0 9365 

7 

UBOL 

0 9506 

0 9316 

8 

REFERENCE 

0 9503 

0.9503 

9 

OCRSYS 

0 9499 

0 9435 

10 

ERIM.1 

0 9496 

0 9334 

11 

ATTJ 

0 9485 

0 9357 

12 

ERIM.2 

0.9479

0.9323 

13

ATT.4

0.9479

0.9315 

14 

KODAKS 

0.9471 

0 9314 

15

ATTJ 

0 9460 

0 9327 

16 

THINK.l 

0 9454 

0 9265 

17 

ELSAGB.l 

0 9439 

0 9249 

18 

IBM 

0.9433 

0 9323 

19 

ATTJ 

0 9420 

0 9262 

20 

SYMBUS 

0.9411 

0.9260 

21 

KODAKJ 

0.9409 

0.9255 

22 

NESTOR 

0 9392 

0 9260 

23 

THINK  J 

0 9385 

0.9280 

24 

HUGHES. 1 

0.9377 

0 9237 

25 

HUGHES.2 

0 9374 

0 9235 

26 

NYNEX 

0.9337 

0 9238 

27 

REI 

0 9334 

0 9253 

28 

GTESS.2 

0.9280 

0 9096 

29 

GTESS.l 

0 9277 

0 9101 

30 

NIST.l 

0 9247 

0.9028 

31 

GMD.3 

0 9225 

0.9001 

32 

COMCOM 

0 9170 

0 9141 

33 

GMD.l 

0 9138 

0.8932 

34 

MIME 

0 9110 

0 8930 

35 

ASOL 

0 9097 

0.8907 

36 

NISTJ 

0 9094 

0 8896 

37 

NISTJ 

0 9071 

0 8858 

38 

UPENN 

0 9051 

0 8873 

39 

RISO

0 9008 

0,8777 

40 

GMD.4 

0 8977 

0.8783 

41 

KAMAN.l 

0 8914 

0 8694 

42 

K AMANJ 

0 8766 

0.8544 

43 

KAMAN.2 

0 8742 

0.8620 

44 

GMD.2 

0.8504 

0.8305 

45 

KAMAN.5 

0 8503 

0.8319 

46 

VALENJ 

0 8343 

0.8227 

47 

IFAX 

0 8289 

0 8118 

48 

VALEN.l 

0 8192 

0 8014 

49 

KAMAN.4 

0 8036 

0.7813 

Table  140:  NIST_4  correlation  graph  key  for  digits. 
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N0T_4.UPPERCOM)ELATE 


SYSTEM  NUMBER 


Figure 229: NIST_4 - upper case correlation


System Number

System Name

Correlation (all)

Correlation (correct)

1

NIST_4

1.0000

1.0000 

2 

VOTE_M

0.9070 

0.8893 

3 

AEG 

0.8972 

0 8820 

4 

REFERENCE 

0 8963 

0.8963 

5

UBOL 

0 8941 

0 8692 

6 

ATT.4 

0.8937 

0.8733 

7 

UMICH.l 

0 8886 

0 8714 

8

VOTE_P

0.8871 

0.8763 

9 

ERIM.1 

0.8863 

0 8703 

10 

NYNEX 

0 8829 

0 8704 

11 

ATTJ 

0.8818 

0 8617 

12 

NESTOR 

0 8813 

0 8638 

13 

ATTJ 

0 8812 

0.8638 

14 

ATT  J 

0.8807 

0 8619 

13 

KODAKU 

0 8807 

0 8608 

16 

IBM 

0.8773 

0.8617 

17 

HUGHES-1 

0 8774 

0.8602 

18 

SYMBUS 

0 8764 

0 8367 

19 

HUGHES.2 

0 8739 

0 8384 

20 

GTESS.1 

0 8683 

0 8601 

21 

MIME 

0.8669 

0 8401 

22 

GTESS.2 

0 8663 

0.8488 

23 

OCRSYS 

0 8687 

0.8476 

24 

ASOL 

0.8308 

0 8283 

23 

NIST.l 

0 8467 

0.8133 

26 

GMD.l 

0.8464 

0 8133 

27 

GMD.3 

0 8442 

0 8113 

28 

RISC 

0 8403 

0 8096 

29 

REI 

0 8328 

0 8133 

30 

GMD.4 

0.8282 

0.7937 

31 

KAMAN.l 

0.8189 

0 7967 

32 

NISTJ 

0.7991 

0 7803 

33 

KAMAN.3 

0.7797 

0 7338 

34 

COMCOM 

0.7728 

0-7669 

33 

K AMAN.2 

0 7726 

0.7470 

36 

IFAX 

0.7713 

0.7307 

37 

NIST.2 

0 7333 

0 7278 

38 

GMD.2 

0 7363 

0-7123 

39 

VALEN.1 

0 7329 

0.7076 

40 

KAMAN-4 

0.7139 

0.6880 

41 

KAMAN-i 

0.6414 

0 6213 

42 

UMICH-2 

0 0347 

0.0177 

Table  141:  NIST_4  correlation  graph  key  for  uppers. 
[Plot NIST_4.LOWER.CORRELATE; x-axis: system number]

Figure 230: NIST_4 - lower case correlation


System Number   System Name   Correlation (all)   Correlation (correct)
1               NIST_4        1.0000              1.0000
2               VOTE_M        0.8378              0.7830
3               UBOL          0.8145              0.7449
4               AEG           0.8144              0.7565
5               ERIM_1        0.8052              0.7508
6               REFERENCE     0.7999              0.7999
7               UMICH_1       0.7971              0.7402
8               ATT_?         0.7960              0.7376
9               OCRSYS        0.7928              0.7431
10              KODAK_?       0.7921              0.7408
11              ATT_?         0.7885              0.7403
12              ATT_?         0.7851              0.7402
13              ATT_4         0.7847              0.7387
14              NYNEX         0.7843              0.7396
15              HUGHES_1      0.7817              0.7323
16              HUGHES_2      0.7782              0.7295
17              VOTE_P        0.7781              0.7492
18              IBM           0.7762              0.7312
19              NESTOR        0.7744              0.7317
20              NIST_1        0.7699              0.7144
21              GMD_3         0.7698              0.7067
22              RISC          0.7678              0.7017
23              GTESS_1       0.7628              0.7153
24              GMD_4         0.7541              0.6915
25              GMD_1         0.7541              0.6915
26              GTESS_2       0.7523              0.7067
27              ASOL          0.7323              0.6857
28              NIST_3        0.7171              0.6888
29              GMD_2         0.6798              0.6330
30              KAMAN_1       0.6733              0.6166
31              NIST_2        0.6631              0.6140
32              VALEN_1       0.6611              0.6096
33              KAMAN_3       0.6516              0.5969
34              KAMAN_2       0.6360              0.5839
35              KAMAN_4       0.5602              0.5135
36              KAMAN_4       0.5388              0.4908
37              COMCOM        0.4688              0.4564
38              UMICH_2       0.1091              0.0434

Table  142:  NIST_4  correlation  graph  key  for  lowers. 
SYSTEM:  THINK_2

PARTICIPANT:  Stephen Smith

ORGANIZATION:  Thinking Machines Corporation, Cambridge, MA
PREPROCESSING:  Thinning and normalization.

FEATURES:  contour model of arc

CLASSIFICATION:  KNN for variable length vectors

HARDWARE:  32,768 processor CM2 with SUN front end

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           all     NA      NA      NSDB3

STATUS:  one day late

RESULTS:  -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
          REJ.   ERR.      REJ.   ERR.      REJ.   ERR.      TESTDATA1
          RATE   RATE      RATE   RATE      RATE   RATE
          0.00   0.0385
          0.10   0.0086
          0.20   0.0036
          0.30   0.0019
          0.40   0.0012
          0.50   0.0008

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:        0.67
CPU RATE:
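
The RESULTS rows above pair each rejection rate with the error rate that remains. As a minimal sketch of how such a curve can be computed, the Python below assumes that each character comes with a classifier confidence, that the lowest-confidence fraction is rejected, and that the error rate is measured over the accepted characters only. The function name `error_at_rejection` and these conventions are illustrative assumptions; the report section shown here does not state the exact scoring rule.

```python
def error_at_rejection(confidences, correct, reject_fraction):
    """Error rate among accepted characters after rejecting the
    lowest-confidence fraction (assumed convention, not from the report)."""
    n = len(confidences)
    n_reject = int(reject_fraction * n)  # number of characters to reject
    # Indices ordered from least to most confident.
    order = sorted(range(n), key=lambda i: confidences[i])
    accepted = order[n_reject:]
    if not accepted:
        return 0.0
    errors = sum(1 for i in accepted if not correct[i])
    return errors / len(accepted)


# Hypothetical toy data: four characters, two of them misclassified.
confidences = [0.9, 0.2, 0.8, 0.4]
correct = [True, False, True, False]
curve = [(r, error_at_rejection(confidences, correct, r))
         for r in (0.0, 0.25, 0.5)]
```

Under these assumptions the error rate can only fall as the rejection rate rises when the confidence ranks errors below correct answers, which matches the shape of the column above.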
SYSTEM:  THINK_2
BIBLIOGRAPHY: 

The  following  references  have  been  provided  for  this  system: 
none 
[Plot THINK_2 -- DIGITS; y-axis: error rate (%)]

Figure 231: Error rate versus rejection rate for THINK_2

Figure 232: Error rate per writer of THINK_2
[Plot THINK_2.DIGIT.CORRELATE; x-axis: system number]

Figure 233: THINK_2 - digit correlation


System Number   System Name   Correlation (all)   Correlation (correct)
1               THINK_2       1.0000              1.0000
2               REFERENCE     0.9615              0.9615
3               VOTE_M        0.9600              0.9504
4               OCRSYS        0.9591              0.9537
5               AEG           0.9523              0.9420
6               ATT_?         0.9506              0.9422
7               ELSAGB_?      0.9502              0.9413
8               IBM           0.9501              0.9407
9               ELSAGB_?      0.9499              0.9410
10              VOTE_P        0.9494              0.9419
11              ATT_2         0.9484              0.9394
12              ERIM_1        0.9477              0.9376
13              ERIM_2        0.9474              0.9375
14              KODAK_?       0.9450              0.9352
15              ATT_4         0.9443              0.9351
16              REI           0.9434              0.9355
17              HUGHES_1      0.9426              0.9311
18              UBOL          0.9423              0.9328
19              NYNEX         0.9419              0.9328
20              HUGHES_2      0.9416              0.9306
21              NESTOR        0.9404              0.9311
22              THINK_1       0.9401              0.9293
23              KODAK_?       0.9397              0.9295
24              SYMBUS        0.9389              0.9297
25              NIST_4        0.9385              0.9280
26              ELSAGB_1      0.9384              0.9277
27              ATT_?         0.9369              0.9284
28              COMCOM        0.9302              0.9264
29              GTESS_1       0.9213              0.9124
30              GTESS_2       0.9206              0.9113
31              NIST_1        0.9108              0.9012
32              GMD_?         0.9081              0.8980
33              UPENN         0.9041              0.8915
34              MIME          0.9030              0.8932
35              GMD_1         0.9018              0.8920
36              ASOL          0.9018              0.8917
37              NIST_2        0.8964              0.8876
38              NIST_3        0.8916              0.8828
39              GMD_4         0.8873              0.8778
40              RISC          0.8869              0.8759
41              KAMAN_1       0.8786              0.8674
42              KAMAN_3       0.8630              0.8518
43              KAMAN_2       0.8595              0.8489
44              KAMAN_5       0.8420              0.8317
45              GMD_?         0.8387              0.8289
46              VALEN_2       0.8385              0.8272
47              IFAX          0.8249              0.8139
48              VALEN_1       0.8138              0.8033
49              KAMAN_4       0.7865              0.7766

Table  143:  THINK_2  correlation  graph  key  for  digits. 
No  Data  Available


Figure  234:  THINK_2  - upper  case  correlation


There  was  no  data  for  this  evaluation.


Table  144:  THINK_2  correlation  graph  key  for  uppers. 
No  Data  Available


Figure  235:  THINK_2  - lower  case  correlation


There  was  no  data  for  this  evaluation.


Table  145:  THINK_2  correlation  graph  key  for  lowers. 
SYSTEM:  UMICH_1

PARTICIPANT:  M. Shridhar

ORGANIZATION:  University of Michigan, Dearborn, MI
PREPROCESSING:  size normalization.

FEATURES:  rule-based features of all sorts, histogram of direction
vectors (4 directions) evaluated in 16 zones.  Provides
a 64 dimensional feature vector.

CLASSIFICATION:  hybrid statistical, structural, and NN.  Used
modified quadratic discriminant function.

HARDWARE:

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE
           600     600     600     NSDB3?

STATUS:  five days late

RESULTS:  -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
          REJ.   ERR.      REJ.   ERR.      REJ.   ERR.      TESTDATA1
          RATE   RATE      RATE   RATE      RATE   RATE
                           0.00   0.0511    0.00   0.1508
                           0.10   0.0337    0.10   0.1198
                           0.20   0.0256    0.20   0.1012
                           0.30   0.0207    0.30   0.0912
                           0.40   0.0179    0.40   0.0811
                           0.50   0.0172    0.50   0.0720

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:
CPU RATE:
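
The FEATURES entry above gives the arithmetic of the feature vector: 4 directions counted in each of 16 zones yields 4 x 16 = 64 values. The Python below is a minimal sketch of one way to build such a vector. It assumes a binary character image stored as a list of rows, a 4 x 4 zone grid, and a direction count that increments when a foreground pixel has a foreground neighbor in that direction. The function name `zoned_direction_histogram` and these choices are illustrative assumptions; the data sheet does not say how the participant actually computed the direction vectors.

```python
# Four assumed stroke directions as (row step, column step):
# horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def zoned_direction_histogram(image, grid=4):
    """Return grid*grid*4 direction counts for a binary image."""
    rows, cols = len(image), len(image[0])
    features = [0] * (grid * grid * len(DIRECTIONS))
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue  # background pixel
            # Zone index of this pixel in the grid x grid layout.
            zone = (r * grid // rows) * grid + (c * grid // cols)
            for d, (dr, dc) in enumerate(DIRECTIONS):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and image[rr][cc]:
                    features[zone * len(DIRECTIONS) + d] += 1
    return features


# Hypothetical 8 x 8 image containing one horizontal stroke on the top row.
image = [[1] * 8] + [[0] * 8 for _ in range(7)]
vector = zoned_direction_histogram(image)  # 64 values
```

In this toy image only the horizontal-direction bins of the four top zones are non-zero, which is the kind of localized stroke evidence a zoned histogram is meant to capture.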
SYSTEM:  UMICH_1

The  following  references  have  been  provided  for  this  system: 


345 


[Plot UMICH_1 -- UPPERS LOWERS; x-axis: rejection rate (%); y-axis: error rate (%)]

Figure 236: Error rate versus rejection rate for UMICH_1

Figure 237: Error rate per writer of UMICH_1
No  Data  Available


Figure  238:  UMICH_1  - digit  correlation


There  was  no  data  for  this  evaluation.


Table  146:  UMICH_1  correlation  graph  key  for  digits.
[Plot UMICH_1.UPPER.CORRELATE; x-axis: system number]

Figure 239: UMICH_1 - upper case correlation


System Number   System Name   Correlation (all)   Correlation (correct)
1               UMICH_1       1.0000              1.0000
2               VOTE_M        0.9513              0.9369
3               REFERENCE     0.9489              0.9489
4               AEG           0.9450              0.9313
5               ATT_4         0.9361              0.9206
6               ERIM_1        0.9323              0.9181
7               NYNEX         0.9311              0.9190
8               NESTOR        0.9297              0.9141
9               IBM           0.9274              0.9106
10              ATT_2         0.9265              0.9131
11              VOTE_P        0.9240              0.9149
12              UBOL          0.9231              0.9080
13              HUGHES_1      0.9199              0.9056
14              HUGHES_2      0.9179              0.9034
15              ATT_?         0.9166              0.9034
16              KODAK_?       0.9160              0.9022
17              ATT_?         0.9140              0.9022
18              SYMBUS        0.9133              0.8993
19              OCRSYS        0.9088              0.8968
20              GTESS_1       0.8997              0.8883
21              GTESS_2       0.8986              0.8869
22              NIST_4        0.8886              0.8714
23              MIME          0.8879              0.8731
24              ASOL          0.8781              0.8633
25              REI           0.8694              0.8561
26              RISC          0.8552              0.8383
27              GMD_1         0.8551              0.8390
28              GMD_3         0.8539              0.8372
29              NIST_1        0.8491              0.8362
30              KAMAN_1       0.8491              0.8306
31              GMD_4         0.8344              0.8197
32              COMCOM        0.8132              0.8066
33              NIST_3        0.8100              0.8024
34              KAMAN_?       0.8018              0.7849
35              IFAX          0.7992              0.7834
36              KAMAN_2       0.7931              0.7756
37              VALEN_1       0.7605              0.7413
38              NIST_2        0.7598              0.7478
39              GMD_2         0.7488              0.7356
40              KAMAN_4       0.7275              0.7108
41              KAMAN_?       0.6592              0.6458
42              UMICH_2       0.0000              0.0000

Table  147:  UMICH_1  correlation  graph  key  for  uppers.
[Plot UMICH_1.LOWER.CORRELATE; x-axis: system number]

Figure 240: UMICH_1 - lower case correlation


System Number   System Name   Correlation (all)   Correlation (correct)
1               UMICH_1       1.0000              1.0000
2               VOTE_M        0.8728              0.8246
3               AEG           0.8621              0.7987
4               REFERENCE     0.8493              0.8493
5               OCRSYS        0.8472              0.7941
6               ERIM_1        0.8378              0.7886
7               UBOL          0.8373              0.7800
8               IBM           0.8339              0.7817
9               ATT_?         0.8273              0.7747
10              ATT_2         0.8239              0.7818
11              HUGHES_1      0.8232              0.7732
12              HUGHES_2      0.8227              0.7723
13              NESTOR        0.8221              0.7761
14              ATT_4         0.8210              0.7791
15              KODAK_?       0.8174              0.7766
16              ATT_?         0.8168              0.7767
17              NYNEX         0.8121              0.7767
18              VOTE_P        0.8062              0.7803
19              NIST_4        0.7971              0.7402
20              RISC          0.7968              0.7369
21              GTESS_1       0.7943              0.7617
22              NIST_1        0.7820              0.7408
23              GTESS_2       0.7799              0.7407
24              GMD_3         0.7713              0.7266
25              ASOL          0.7603              0.7198
26              GMD_4         0.7671              0.7117
27              GMD_1         0.7671              0.7117
28              NIST_3        0.7301              0.7120
29              GMD_2         0.7022              0.6610
30              VALEN_1       0.6967              0.6426
31              KAMAN_1       0.6890              0.6428
32              NIST_2        0.6718              0.6364
33              KAMAN_?       0.6670              0.6216
34              KAMAN_2       0.6476              0.6062
35              KAMAN_?       0.6727              0.6363
36              KAMAN_4       0.6384              0.6062
37              COMCOM        0.4923              0.4823
38              UMICH_2       0.0000              0.0000

Table  148:  UMICH_1  correlation  graph  key  for  lowers. 
SYSTEM:  VALEN_2

PARTICIPANT:  Enrique Vidal

ORGANIZATION:  Universidad Politecnica de Valencia, Valencia, Spain

FEATURES:  line fit features
CLASSIFICATION:  k-NN or NN with BP
HARDWARE:  model 380 HP-9000

TRAINING:  DIGITS  UPPERS  LOWERS  DATABASE

STATUS:  seven days late

RESULTS:  -- DIGITS --     -- UPPERS --     -- LOWERS --     DATABASE
          REJ.   ERR.      REJ.   ERR.      REJ.   ERR.      TESTDATA1
          RATE   RATE      RATE   RATE      RATE   RATE
          0.00   0.1575
          0.10   0.1144
          0.20   0.0756
          0.30   0.0488
          0.40   0.0307
          0.50   0.0192

OCR RATE (CPS):  DIGITS  UPPERS  LOWERS
SYS RATE:
CPU RATE:
SYSTEM:  VALEN_2

The  following  references  have  been  provided  for  this  system:
[59]
[Plot VALEN_2 -- DIGITS; x-axis: rejection rate (%)]

Figure 241: Error rate versus rejection rate for VALEN_2

Figure 242: Error rate per writer of VALEN_2
[Plot VALEN_2.DIGIT.CORRELATE; x-axis: system number]

Figure 243: VALEN_2 - digit correlation


System Number   System Name   Correlation (all)   Correlation (correct)
1               VALEN_2       1.0000              1.0000
2               VOTE_M        0.8467              0.8362
3               ERIM_1        0.8429              0.8301
4               AEG           0.8427              0.8316
5               REFERENCE     0.8426              0.8426
6               OCRSYS        0.8419              0.8366
7               KODAK_?       0.8409              0.8279
8               IBM           0.8403              0.8296
9               ATT_?         0.8396              0.8288
10              KODAK_?       0.8393              0.8247
11              VOTE_P        0.8392              0.8323
12              ELSAGB_?      0.8387              0.8296
13              THINK_2       0.8386              0.8272
14              ELSAGB_?      0.8384              0.8293
15              ATT_?         0.8382              0.8298
16              ERIM_2        0.8379              0.8276
17              NESTOR        0.8376              0.8246
18              HUGHES_1      0.8371              0.8234
19              ATT_4         0.8370              0.8267
20              UBOL          0.8366              0.8261
21              HUGHES_2      0.8360              0.8228
22              NYNEX         0.8346              0.8237
23              NIST_4        0.8343              0.8227
24              SYMBUS        0.8341              0.8223
25              REI           0.8332              0.8241
26              ELSAGB_1      0.8327              0.8203
27              ATT_?         0.8326              0.8219
28              THINK_1       0.8308              0.8201
29              GTESS_2       0.8220              0.8102
30              GTESS_1       0.8212              0.8102
31              COMCOM        0.8196              0.8169
32              GMD_3         0.8128              0.7990
33              NIST_1        0.8121              0.7997
34              MIME          0.8096              0.7960
35              ASOL          0.8092              0.7944
36              UPENN         0.8076              0.7929
37              GMD_1         0.8069              0.7934
38              NIST_3        0.8066              0.7922
39              NIST_2        0.8039              0.7909
40              RISC          0.7994              0.7832
41              KAMAN_1       0.7961              0.7782
42              GMD_4         0.7944              0.7807
43              KAMAN_?       0.7847              0.7670
44              KAMAN_?       0.7834              0.7666
45              KAMAN_?       0.7696              0.7606
46              IFAX          0.7626              0.7391
47              GMD_2         0.7694              0.7438
48              VALEN_1       0.7348              0.7179
49              KAMAN_4       0.7180              0.7006

Table  149:  VALEN_2  correlation  graph  key  for  digits.
No  Data  Available


Figure  244:  VALEN_2  - upper  case  correlation


There  was  no  data  for  this  evaluation.


Table  150:  VALEN_2  correlation  graph  key  for  uppers. 
No  Data  Available


Figure  245:  VALEN_2  - lower  case  correlation


There  was  no  data  for  this  evaluation.


Table  151:  VALEN_2  correlation  graph  key  for  lowers.